Task #12027 (closed)
Opened 10 years ago
Closed 10 years ago
Bug: speed up 70K-file directories
Reported by: | jamoore | Owned by: | mlinkert |
---|---|---|---|
Priority: | critical | Milestone: | 5.0.1 |
Component: | Bio-Formats | Version: | 5.0.0 |
Keywords: | n.a. | Cc: | dpwrussell |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | n.a. |
Sprint: | n.a. |
Description
Helio's files (under /ome/team/dpwrussell) lead to exceptionally long import-candidate calculations (on the order of hours), most likely due to the number of files in one directory (nearly 75000). Methods like File.getLength take enormous amounts of time. There may not be much we can do to speed these up (especially over NFS), but perhaps we can detect that it's taking so much time and check less?
"main" prio=10 tid=0x00007f628800b000 nid=0x1ceb runnable [0x00007f628e88c000]
   java.lang.Thread.State: RUNNABLE
    at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
    at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
    at java.io.File.isHidden(File.java:905)
    at ome.scifio.io.Location.isHidden(Location.java:640)
    at ome.scifio.io.Location.list(Location.java:398)
    at loci.common.Location.list(Location.java:253)
    at loci.formats.in.PerkinElmerReader.initFile(PerkinElmerReader.java:308)
    at loci.formats.FormatReader.setId(FormatReader.java:1360)
    at loci.formats.ImageReader.setId(ImageReader.java:781)
    at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:576)
    at loci.formats.ChannelFiller.setId(ChannelFiller.java:263)
    at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:576)
    at loci.formats.ChannelSeparator.setId(ChannelSeparator.java:274)
    at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:576)
    at loci.formats.Memoizer.setId(Memoizer.java:471)
    at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:576)
    at ome.formats.importer.ImportCandidates.singleFile(ImportCandidates.java:414)
    at ome.formats.importer.ImportCandidates.handleFile(ImportCandidates.java:595)
    at org.apache.commons.io.DirectoryWalker.walk(DirectoryWalker.java:367)
    at org.apache.commons.io.DirectoryWalker.walk(DirectoryWalker.java:335)
    at ome.formats.importer.ImportCandidates.execute(ImportCandidates.java:368)
    at ome.formats.importer.ImportCandidates.<init>(ImportCandidates.java:229)
    at ome.formats.importer.ImportCandidates.<init>(ImportCandidates.java:180)
    at ome.formats.importer.cli.CommandLineImporter.<init>(CommandLineImporter.java:111)
    at ome.formats.importer.cli.CommandLineImporter.main(CommandLineImporter.java:683)
Change History (3)
comment:1 Changed 10 years ago by jamoore
comment:2 Changed 10 years ago by mlinkert
Performance of File.isHidden() seems to be the underlying cause; removing calls to isHidden() brings the setId time down to 3 seconds over NFS. See #6586.
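For illustration only, here is a rough sketch of listing a directory without calling File.isHidden() on each entry. It is not the code from the linked pull request. The class and method names (VisibleNames, isDotHidden, listVisible) are made up for this example. It assumes the UNIX convention documented for File.isHidden(), where a file is hidden if its name begins with a period, so it would not match the JDK's behaviour on Windows.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Bio-Formats or OMERO.
class VisibleNames {

    /** Name-only hidden test. Assumes the UNIX leading-dot convention. */
    static boolean isDotHidden(String name) {
        return !name.isEmpty() && name.charAt(0) == '.';
    }

    /**
     * Returns the names of non-hidden entries in dir, using a single
     * directory listing and no per-file File.isHidden() calls.
     */
    static List<String> listVisible(File dir) {
        // File.list() returns null if dir is not a directory or an I/O error occurs.
        String[] names = dir.list();
        List<String> visible = new ArrayList<>();
        if (names == null) {
            return visible;
        }
        for (String name : names) {
            if (!isDotHidden(name)) {
                visible.add(name);
            }
        }
        return visible;
    }
}
```

Calling `VisibleNames.listVisible(new File(path))` on the 75000-file directory would still read every entry name once, but it should avoid the per-file attribute lookup that appears at the top of the stack trace.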
comment:3 Changed 10 years ago by mlinkert
- Resolution set to fixed
- Status changed from new to closed
Hopefully fixed with https://github.com/openmicroscopy/bioformats/pull/1003
More stack traces: