Activity Stream Refactoring 6.2 Performance

Last modified by Vincent Massol on 2024/11/19 16:13

 XWiki
 Requirements
 Dormant
 
 

[xwiki-devs] Activity Stream Logic http://markmail.org/thread/n4ncqeokl5ebyylm (Feb 13, 2012)

Description

Performance of the current (<=6.1) implementation

Algorithm used to display events

  • (1 query) Get the pages on which events occurred recently, grouped by day.
    • (P queries) For each page and day combination, get all the events that occurred on that day.
      • (Ea <= Em queries) Remove duplicate events by:
        • squashing each contiguous series of identical events from the same user into a single one, displaying only the last event of that series
        • skipping update events associated with higher-level events (e.g. an addAttachment event is preceded by an update event and we don't want to see both).
                    Related events are checked for each non-consecutive update event (i.e. one that is not in a chain of update events).
      • (Ea * approx. 3 queries) For each event in that page and day, display the event
        • Display the localized description of the event (added comment, created page, etc.), also passing the number of related events (e.g. "added 3 attachments"):
                    $!services.localization.render("xe.activity.action.${event.type}", [$relatedEventsNo])
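
The squashing step in the list above can be sketched as follows. This is a minimal Python illustration, not the Activity Stream code: the `Event` record, its fields and `squash_contiguous` are hypothetical names, and the events are assumed to belong to a single page and day, ordered oldest first. The skipping of update events tied to higher-level events is not covered.

```python
from itertools import groupby
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical, simplified event record (not the XWiki event class)."""
    type: str
    user: str
    timestamp: int


def squash_contiguous(events):
    """Collapse each contiguous run of same-type, same-user events.

    Returns (last_event_of_run, run_length) pairs, so the display can show
    the last event together with a count such as "added 3 attachments".
    Assumes `events` is ordered oldest first.
    """
    squashed = []
    for _, run in groupby(events, key=lambda e: (e.type, e.user)):
        run = list(run)
        squashed.append((run[-1], len(run)))
    return squashed


events = [
    Event("create", "alice", 1),
    Event("update", "alice", 2),
    Event("update", "alice", 3),
    Event("addAttachment", "bob", 4),
    Event("addAttachment", "bob", 5),
    Event("addAttachment", "bob", 6),
    Event("update", "alice", 7),
]
displayed = squash_contiguous(events)
```

In this sample, E = 7 stored events collapse to Ea = 4 displayed entries, and the run length plays the role of $relatedEventsNo.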

Where,

P = nr. of modified pages to display
Pm = max nr. of modified pages to display (entries parameter in AS macro, default 20)
E = nr. of events for a page in a day
Em = nr. of subentries to display for each page (subentries parameter in AS macro, default 10)
Ea = nr. of actually displayed events for a page (i.e. alternations in a series of events for a page), Ea <= E and Ea <= Em

Estimations on performance

Nr. of queries in an AS display:
1 + P * (1 + Ea + Ea * 3)

Nr. of max queries, where P=Pm and Ea=Em:
1 + Pm * (1 + Em + Em * 3)

Nr. of max queries, in a busy wiki, using default values (Pm=20, Em=10) =
1 + 20 * (1 + 10 + 10 * 3) = 821

Busy wiki =  

  • at least 20 distinct pages being heavily worked on, not necessarily on the same day (so the same page can also show up on 2+ days)
  • at least 10 alternating changes for each page in a day (thus, "heavily worked on" pages that result in displayed subentries in AS for that page)
    • this is alternating : create->edit(x times)->addComment(y times)->edit->addAnnotation->addAttachment(z times)-> etc.
    • this is not alternating: edit->edit->edit->etc.

Nr. of min queries, in a light wiki, where P=20 (default) and Ea=1 (i.e. one event per page)
1 + 20 * (1 + 1 + 1 * 3 ) = 101

Nr. of average queries, in a wiki with an average level of collaboration, where P=20 (=Pm, using the default value) and Ea varies between 1 and 10 (=Em)
(101 + 821) / 2 = 461

All of this is done just to display the {{activity/}} macro, and these numbers cover only the queries related to events. There are also security queries, performed for each displayed page, and other UI-related queries.
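
The estimates above can be reproduced with a few lines of Python. This is only a sketch of the formula 1 + P * (1 + Ea + Ea * 3). The function and parameter names are made up for illustration, and the factor 3 is the approximate per-event display cost quoted above.

```python
def event_queries(pages, displayed_events_per_page, queries_per_event=3):
    """Estimated number of event-related queries for one Activity Stream display.

    1 query for the pages, then per page: 1 query for its events,
    up to Ea queries for duplicate removal and Ea * ~3 queries for display.
    """
    p, ea = pages, displayed_events_per_page
    return 1 + p * (1 + ea + ea * queries_per_event)


busy = event_queries(20, 10)   # Pm = 20, Em = 10
light = event_queries(20, 1)   # P = 20, Ea = 1
average = (busy + light) // 2
print(busy, light, average)    # 821 101 461
```

Since the count grows with the product P * Ea, lowering either the entries or the subentries parameter reduces the number of queries roughly proportionally.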

Cost of displaying event details

The display of event details is expensive because it requires work to be done at display time, instead of at storage time.

In the current implementation:

  • at storage time, for some events (e.g. addAttachment, addComment, etc.), an identifier is also stored inside one of the event's parameters.
  • at display time, this identifier (usually a file name, object number, etc.) is used to identify and retrieve the entity affected by the event (such as attachment file, comment object, etc.). Once retrieved, the event's details are displayed.

One thing to keep in mind when refactoring is that all the information required at display time (different for each event type) can be computed at storage time and kept in the event's fields (or parameters), much like what has been done for the page title (which is cached at storage time in the event.title field). This would greatly reduce the load and the complexity of the display task.
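
A minimal sketch of that idea in Python, assuming a dictionary-based event and an in-memory attachment store. All names here (`ATTACHMENTS`, `load_attachment`, `store_event`, `render_event`) are hypothetical and do not correspond to the XWiki API. The point is only that the entity lookup happens once, when the event is stored, instead of on every display.

```python
# Hypothetical in-memory stand-in for the wiki store; names are illustrative.
ATTACHMENTS = {"report.pdf": {"size": 52_000, "mime": "application/pdf"}}
lookups = 0


def load_attachment(name):
    """Simulates the entity retrieval that is currently done at display time."""
    global lookups
    lookups += 1
    return ATTACHMENTS[name]


def store_event(event_type, page_title, attachment_name):
    """Store an event with its display details computed once, at storage time."""
    details = load_attachment(attachment_name)
    return {
        "type": event_type,
        "title": page_title,  # analogous to the cached event.title field
        "parameters": {"name": attachment_name, **details},
    }


def render_event(event):
    """Render from the stored fields only; no lookup at display time."""
    p = event["parameters"]
    return f'{event["type"]} on "{event["title"]}": {p["name"]} ({p["size"]} bytes)'


event = store_event("addAttachment", "Main.WebHome", "report.pdf")
lines = [render_event(event) for _ in range(3)]  # three displays, one lookup
```

The trade-off in this sketch is that the stored details are a snapshot: if the attachment changes later, the event keeps showing the values captured at storage time.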

Comparison on performance

Note: The parameters were tweaked so that each prototype page would display the same time period at the time of the test, with the current implementation's 20 pages (items) parameter as the reference.

Metric \ Prototype           | Page Centred (Current)       | Page Centred (Experimental) | User Centred | Time Centred
Execution time (ms)          | 800-1300                     | 200-300                     | 80-180       | 1100-1400
Security checks (view right) | 20                           | 12                          | 13           | 44
Queries                      | many (see previous sections) | 1                           | 1            | 2
Parameters                   | 20 pages                     | 20 pages                    | 40 events    | 20 groups, 10 min per group

Notes:

  • Page Centred (Current): Also displays events per page, including a lot of filtering and removal of low-level events, which is costly.
  • Page Centred (Experimental): Events per page can be done in a separate AJAX call by expanding.
  • Time Centred: Events per group are available, but not very useful on the first display. Groups can be expanded to get events per group in an AJAX call. Performance varies a lot, depending on configured groups and group delay.

Some more details on each prototype can be found on the AcitvityStreamRefactoring62Focus page.


 
