Task #12041 (closed)
Opened 10 years ago
Closed 9 years ago
RFE: more intelligence in assigning FileAnnotation media type
Reported by: | mtbcarroll | Owned by: | |
---|---|---|---|
Priority: | major | Milestone: | 5.1.1 |
Component: | Attachments | Version: | n.a. |
Keywords: | n.a. | Cc: | python-team@…, java@… |
Resources: | n.a. | Referenced By: | n.a. |
References: | n.a. | Remaining Time: | n.a. |
Sprint: | n.a. |
Description
Both attached to an image through the webclient:
omero=# select name, mimetype, hash from originalfile where name like 'foo%';
  name   |         mimetype         |                   hash
---------+--------------------------+------------------------------------------
 foo     | application/octet-stream | 1d229271928d3f9e2bb0375bd6ce5db6c6d348d9
 foo.txt | text/plain               | 1d229271928d3f9e2bb0375bd6ce5db6c6d348d9
(2 rows)
We should be able to assign MIME type more intelligently than this. If nothing else, once we are using Java 7 we could probably use http://docs.oracle.com/javase/7/docs/api/java/nio/file/Files.html#probeContentType%28java.nio.file.Path%29
Change History (13)
comment:1 Changed 10 years ago by jamoore
- Cc python-team@… java@… added; omero-team@… removed
comment:2 Changed 10 years ago by mtbcarroll
I don't know where or how media types get assigned, though it is certainly somewhere on the path by which regular non-programmer users attach files via Web or Insight. I don't know if it would work to let clients leave it as null and have the server guess for original files without a media type already set.
It seems more reasonable to do less automatically for people writing scripts or their own clients, should we wish; it would just be nice to have better defaults for people who aren't really thinking about this stuff.
comment:3 Changed 10 years ago by jamoore
Will / J-m: thoughts on the cost of this and what version it would be feasible for?
comment:4 Changed 10 years ago by wmoore
Seems like we can simply use http://docs.python.org/2/library/mimetypes.html#mimetypes.guess_type to set a mimetype based on file name? Should be pretty straight-forward.
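A minimal sketch of the filename-based approach suggested above, using only the standard library. The helper name `guess_mimetype` and the choice of `application/octet-stream` as the fallback are assumptions made for illustration (the fallback mirrors the value seen in the query in the description); nothing here is taken from the OMERO code base.

```python
import mimetypes

# Assumed fallback, matching the value stored for "foo" in the description.
DEFAULT_MIMETYPE = "application/octet-stream"


def guess_mimetype(file_name):
    """Guess a MIME type from the file name alone (hypothetical helper)."""
    # guess_type returns a (type, encoding) tuple; type is None when the
    # extension is missing or unknown.
    guessed, _encoding = mimetypes.guess_type(file_name)
    return guessed or DEFAULT_MIMETYPE


if __name__ == "__main__":
    for name in ("foo", "foo.txt"):
        print(name, guess_mimetype(name))
```

Because only the name is inspected, two files with identical content can still end up with different types, which is the behaviour reported in the description.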
comment:5 Changed 10 years ago by wmoore
- Milestone changed from Unscheduled to 5.0.1
- Owner changed from jamoore to wmoore
- Priority changed from minor to major
- Version 5.0.0 deleted
comment:6 Changed 10 years ago by mtbcarroll
Any chance of something content-aware like http://github.com/ahupp/python-magic even if extension-based guessing is still used as a fallback for when something better is for some reason unavailable? Judging by my query results in the description we already just guess on filename, though I don't know by what rules. (Hmm, I wonder what foo.text would be taken as being.)
Trying to get it right is useful partly because when the media type isn't properly recognized the attachment goes unindexed.
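The content-aware idea with an extension-based fallback could look roughly like the sketch below. It assumes the optional python-magic package (and its libmagic dependency) may or may not be installed; `sniff_with_magic`, `guess_mimetype` and the `sniff` parameter are hypothetical names introduced for illustration, not existing OMERO functions.

```python
import mimetypes

try:
    import magic  # python-magic; optional, requires libmagic
except ImportError:
    magic = None

DEFAULT_MIMETYPE = "application/octet-stream"


def sniff_with_magic(path):
    """Return a MIME type based on file content, or None if unavailable."""
    if magic is None:
        return None
    return magic.from_file(path, mime=True)


def guess_mimetype(path, sniff=sniff_with_magic):
    """Prefer content sniffing; fall back to the file extension."""
    try:
        sniffed = sniff(path)
    except Exception:
        # Unreadable file, incompatible "magic" module, etc.: use the fallback.
        sniffed = None
    # A generic result from sniffing is no better than no result at all.
    if sniffed and sniffed != DEFAULT_MIMETYPE:
        return sniffed
    by_name, _encoding = mimetypes.guess_type(path)
    return by_name or DEFAULT_MIMETYPE
```

Passing the sniffer in as a parameter keeps the fallback path testable on machines without libmagic, for example `guess_mimetype("foo.txt", sniff=lambda p: None)`.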
comment:7 Changed 10 years ago by jburel
- Milestone changed from 5.0.1 to 5.0.2
This one dropped off my radar. Pushing to 5.0.2 so we can replace the code currently in the Pojos with an external library.
Too late to put it now unless somebody thinks otherwise.
comment:8 Changed 10 years ago by mtbcarroll
Yes, good idea to push.
comment:9 Changed 10 years ago by wmoore
- Owner wmoore deleted
I'm pretty unclear about what needs to be done here (or where? - server-side?).
comment:10 Changed 10 years ago by mtbcarroll
I don't know where the MIME type is actually assigned, or for which code paths it is appropriate to assign it based on file content. If it is in Java land and somebody can point me in the general direction of where it happens now, then I could write some code. A client could perhaps leave it null if it wants, provided I know where to detect and fix that server-side.
comment:11 Changed 9 years ago by jamoore
- Milestone changed from 5.1.0-m4 to 5.1.1
Perhaps something to review in the lead up to 5.2.
comment:12 Changed 9 years ago by mtbcarroll
If we want to let the server do it, then we might want to push to 5.2 if that allows assuming Java 7.
comment:13 Changed 9 years ago by jamoore
- Resolution set to duplicate
- Status changed from new to closed
Closing in favor of https://trello.com/c/v7nz8Qot/478-intelligent-mimetypes
Do you mean specifically in omero.gateway (Python), in both clients, or server-side?