Task #11950 (closed)
Opened 10 years ago
Closed 10 years ago
Coalesce duplicate events for processing at the DB level
Reported by: | jballanco-x | Owned by: | jballanco-x |
---|---|---|---|
Priority: | major | Milestone: | 5.0.3 |
Component: | General | Version: | 4.4.10 |
Keywords: | full-text indexing, search | Cc: | |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | n.a. |
Sprint: | n.a. |
Description
Currently, when indexing events, the strategy is to load a batch of events, look for duplicates within the batch, deduplicate it, process it, and then move on to the next batch.
Instead, we could craft a DB query that selects events after a certain position, uses DISTINCT to remove duplicates, and returns a batch-sized set of results. This would save processing in the indexer (the DB should be more efficient at building a distinct set anyway) and would extend the window for coalescing duplicates to all events newer than the last processed event (instead of only those within the next batch).
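A rough sketch of what such a query could look like, using an in-memory SQLite database so it can be run standalone. The table name `eventlog`, its columns, and the helper names `make_demo_db` and `next_batch` are assumptions made for illustration, not the actual schema or the code in the referenced pull request. The sketch also uses GROUP BY with MAX(id) rather than a plain DISTINCT, so that each coalesced row carries the position of its newest event and the indexer can advance past it.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory event log (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE eventlog ("
        "id INTEGER PRIMARY KEY, entity_type TEXT, entity_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO eventlog (id, entity_type, entity_id) VALUES (?, ?, ?)",
        [
            (1, "Image", 1),
            (2, "Image", 1),    # duplicate of event 1
            (3, "Dataset", 7),
            (4, "Image", 2),
            (5, "Image", 1),    # duplicate again, outside a batch of 2
            (6, "Dataset", 7),  # duplicate of event 3
        ],
    )
    return conn


def next_batch(conn, last_id, batch_size):
    """Return up to batch_size coalesced events newer than last_id.

    Each row is (entity_type, entity_id, newest_event_id). Rows are ordered
    by newest_event_id, so the last row's id is a safe new position: any
    entity with a newer event is left for a later batch.
    """
    rows = conn.execute(
        """
        SELECT entity_type, entity_id, MAX(id) AS newest_id
        FROM eventlog
        WHERE id > ?
        GROUP BY entity_type, entity_id
        ORDER BY newest_id
        LIMIT ?
        """,
        (last_id, batch_size),
    ).fetchall()
    new_position = rows[-1][2] if rows else last_id
    return rows, new_position


if __name__ == "__main__":
    conn = make_demo_db()
    position = 0
    while True:
        batch, position = next_batch(conn, position, batch_size=2)
        if not batch:
            break
        print(position, batch)
```

With this approach the six demo events collapse to three units of work, and the duplicates of `Image 1` are coalesced even though they would have landed in different batches under the per-batch strategy.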
Change History (4)
comment:1 Changed 10 years ago by jballanco-x
comment:2 Changed 10 years ago by jballanco-x
- Milestone changed from 5.0.1 to 5.0.2
comment:3 Changed 10 years ago by jballanco-x
Referencing ticket #11936 has changed sprint.
comment:4 Changed 10 years ago by jamoore
- Milestone changed from 5.0.4 to 5.0.3
- Resolution set to fixed
- Status changed from new to closed
This is handled via https://github.com/openmicroscopy/openmicroscopy/pull/2639
Referencing ticket #11936 has changed sprint.