Activity Stream Refactoring 6.2 Performance
- XWiki
- Requirements
- Dormant
[xwiki-devs] Activity Stream Logic http://markmail.org/thread/n4ncqeokl5ebyylm (Feb 13, 2012)
Description
Performance of the current (<=6.1) implementation
Algorithm used to display events
- (1 query) Get the pages on which events occurred recently, grouped by day.
- (P queries) For each page and day combination, get all the events that occurred on that day.
- (Ea <= Em queries) Remove duplicate events:
- squash each contiguous series of identical events by the same user into a single event, displaying the last one in the series
- skip update events associated with higher-level events (e.g. an addAttachment event is preceded by an update event and we don't want to see both).
Related events are checked for each non-consecutive update event (i.e. one that is not part of a chain of update events).
- (Ea * approx. 3 queries) For each event in that page and day, display the event:
- Display the localized description of the event (added comment, created page, etc.) but also pass the number of related events. (e.g. "added 3 attachments")
$!services.localization.render("xe.activity.action.${event.type}", [$relatedEventsNo])
Where,
| P | nr. of modified pages to display |
| Pm | max nr. of modified pages to display (entries parameter in AS macro, default 20) |
| E | nr. of events for a page in a day |
| Em | nr. of subentries to display for each page (subentries parameter in AS macro, default 10) |
| Ea | nr. of events actually displayed for a page (i.e. alternations in a series of events for a page), Ea <= E and Ea <= Em |
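The squashing step of the duplicate removal can be sketched as below. This is only an illustration, not the actual Activity Stream code: the `Event` class, its fields and the `squash` function are hypothetical names, and the second rule (skipping update events that belong to a higher-level event) is left out.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Event:
    # Hypothetical, simplified event record (not the real XWiki class).
    type: str       # e.g. "update", "addComment", "addAttachment"
    user: str
    timestamp: int


def squash(events, max_subentries=10):
    """Collapse each contiguous run of same-type, same-user events.

    Returns (last_event_of_run, run_length) pairs, newest first,
    capped at max_subentries (Em in the table above).
    """
    ordered = sorted(events, key=lambda e: e.timestamp, reverse=True)
    squashed = []
    for _, run in groupby(ordered, key=lambda e: (e.type, e.user)):
        run = list(run)
        # Newest first, so run[0] is the last event of the series.
        squashed.append((run[0], len(run)))
    return squashed[:max_subentries]


# One page, one day: E = 6 stored events.
day_events = [
    Event("create", "alice", 1),
    Event("update", "alice", 2),
    Event("update", "alice", 3),
    Event("addComment", "bob", 4),
    Event("addComment", "bob", 5),
    Event("update", "alice", 6),
]
displayed = squash(day_events)
# Ea = len(displayed) = 4: update, addComment (x2), update (x2), create
```

The run length is what would be passed to the localized description as the number of related events (e.g. "added 2 comments").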
Estimations on performance
Nr. of queries in an AS display:
1 + P * (1 + Ea + Ea * 3)
Nr. of max queries, where P=Pm and Ea=Em:
1 + Pm * (1 + Em + Em * 3)
Nr. of max queries, in a busy wiki, using default values (Pm=20, Em=10) =
1 + 20 * (1 + 10 + 10 * 3) = 821
Busy wiki =
- at least 20 distinct pages being heavily worked on, not necessarily all on the same day (so the same page can also show up on 2+ days)
- at least 10 alternating changes for each page in a day (thus, "heavily worked on" pages that result in displayed subentries in AS for that page)
- this is alternating : create->edit(x times)->addComment(y times)->edit->addAnnotation->addAttachment(z times)-> etc.
- this is not alternating: edit->edit->edit->etc.
Nr. of min queries, in a light wiki, where P=20 (default) and Ea=1 (i.e. one event per page)
1 + 20 * (1 + 1 + 1 * 3 ) = 101
Nr. of average queries, in a wiki with average collaboration, where P=20 (=Pm, using the default value) and Ea varies between 1 and 10 (=Em, using the default value)
(101 + 821) / 2 = 461
All of this is done just to display the {{activity/}} macro, and these are only the queries related to events. There are also security queries, performed for each displayed page, and other UI-related queries.
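The estimates in this section can be reproduced with a few lines of Python. The function name `event_queries` is made up for this sketch; the formula is the one given above, 1 + P * (1 + Ea + Ea * 3).

```python
def event_queries(pages, displayed_events):
    """Event-related queries for one activity stream display.

    1 query for the page list, then per page: 1 query for its events,
    Ea queries to remove duplicates and about 3 * Ea queries to display.
    """
    per_page = 1 + displayed_events + 3 * displayed_events
    return 1 + pages * per_page


busy = event_queries(pages=20, displayed_events=10)   # 821
light = event_queries(pages=20, displayed_events=1)   # 101
average = (busy + light) / 2                          # 461.0
```

The count grows linearly in both P and Ea, so halving either the entries or the subentries parameter roughly halves the number of event queries.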
Cost of displaying event details
The display of event details is expensive because it requires work to be done at display time, instead of at storage time.
In the current implementation:
- at storage time, for some events (e.g. addAttachment, addComment, etc.) an identifier is also stored inside one of the event's parameters.
- at display time, this identifier (usually a file name, object number, etc.) is used to identify and retrieve the entity affected by the event (such as the attachment file, comment object, etc.). Once the entity is retrieved, the event's details are displayed.
One thing to keep in mind when refactoring is that all the information required at display time (different for each event type) can be computed at storage time and kept in the event's fields (or parameters), much like what is already done for the page title (which is cached at storage time in the event.title field). This greatly reduces the load and the complexity of the display task.
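A minimal sketch of that idea, assuming a simplified event model: `StoredEvent`, `record_attachment_event`, `render` and the `filename`/`size` parameters are all hypothetical names chosen for illustration, not the actual Activity Stream API.

```python
from dataclasses import dataclass, field


@dataclass
class StoredEvent:
    # Hypothetical event record; only the cached title mirrors the text above.
    type: str
    page: str
    title: str                                   # cached at storage time
    params: dict = field(default_factory=dict)   # display details go here


def record_attachment_event(page, title, attachment):
    """Storage time: copy everything the display will need into the event."""
    return StoredEvent(
        type="addAttachment",
        page=page,
        title=title,
        params={"filename": attachment["filename"], "size": attachment["size"]},
    )


def render(event):
    """Display time: format from the event alone, without loading the entity."""
    if event.type == "addAttachment":
        name, size = event.params["filename"], event.params["size"]
        return f"{event.title}: added attachment {name} ({size} bytes)"
    return f"{event.title}: {event.type}"


event = record_attachment_event(
    "Main.WebHome", "Home", {"filename": "a.png", "size": 1024}
)
line = render(event)  # "Home: added attachment a.png (1024 bytes)"
```

The trade-off is that stored details can become stale (for example, if the attachment is later renamed or deleted), which the display would then have to tolerate.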
Comparison on performance
Note: The parameters were tweaked so that each page would display the same time period at the time of the test, but the reference was the current implementation's 20 pages (items) parameter.
| Metric \ Prototype | Page Centred (Current) | Page Centred (Experimental) | User Centred | Time Centred |
|---|---|---|---|---|
| Execution time (ms) | 800-1300 | 200-300 | 80-180 | 1100-1400 |
| Security checks (view right) | 20 | 12 | 13 | 44 |
| Queries | many (see previous sections) | 1 | 1 | 2 |
| Parameters | 20 pages | 20 pages | 40 events | 20 groups, 10 min per group |
| Note | Also displays events per page, including a lot of filtering and removal of low-level events, which is costly. | Events per page can be done in a separate AJAX call by expanding. | | Events per group are available, but not very useful on the first display. Groups can be expanded to get events per group in an AJAX call. Performance varies a lot, depending on configured groups and group delay. |
Some more details on each prototype can be found on the AcitvityStreamRefactoring62Focus page.
Eduard Moraru