Freebase

        Discussions on File Format

        1.  

          Lots of non-file format things here

          also posted to
          • Computers,
          • Serialization,
          • sandos
          5 posts, latest post: anjackson, Aug 24, 2010
          1.  
            tfmorris Top Contributor Freebase Experts
            May 3, 2010
            tfmorris says:

            This type is being used as a kind of catch-all for everything from low-level encodings, which aren't file formats at all, to top-level container formats. As an example, mu-law, PCM, ADPCM, G.721, etc. are all audio encodings or families of audio encodings which are always, at least these days, wrapped in some container format before being written to disk, so I don't consider them to be file formats at all.

            It feels messy, but perhaps the flexibility outweighs the value of finer-grained modeling. What do others think?

            1.  
              anjackson
              Aug 13, 2010
              anjackson says:

              I also think this is a bit too messy. For example, many of the entries are in fact format families (e.g. PDF) and so do not provide an easy way of identifying different versions of the format (e.g. PDF 1.5 does not have a Freebase URI). Perhaps we could add a 'Bitstream Encoding' entity (which may be embedded in another, or may act as a stand-alone file) and then point the current Format entities at these. For example:

              PDF . has_version . [Bitstream Encoding: PDF 1.6]

              But perhaps this will turn things people think of as Formats into Encodings and they won't be able to find them?
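The proposal above, and the findability worry that follows it, can be sketched in a few lines. This is only an illustration under stated assumptions: `BitstreamEncoding`, `FileFormat`, and `build_index` are hypothetical names, not part of the Freebase schema. The sketch shows one possible way to keep version-level entries searchable while still hanging them off the family topic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BitstreamEncoding:
    """Hypothetical type: one concrete version, e.g. 'PDF 1.6'."""
    name: str


@dataclass
class FileFormat:
    """A format family such as 'PDF', pointing at its versions."""
    name: str
    has_version: List[BitstreamEncoding] = field(default_factory=list)


def build_index(formats: List[FileFormat]) -> Dict[str, str]:
    """Map every searchable name (family or version) to its family name."""
    index: Dict[str, str] = {}
    for fmt in formats:
        index[fmt.name.lower()] = fmt.name
        for enc in fmt.has_version:
            index[enc.name.lower()] = fmt.name
    return index


# PDF . has_version . [Bitstream Encoding: PDF 1.5], [Bitstream Encoding: PDF 1.6]
pdf = FileFormat("PDF", [BitstreamEncoding("PDF 1.5"), BitstreamEncoding("PDF 1.6")])
index = build_index([pdf])
print(index["pdf 1.6"])  # prints "PDF"
```

With an index like this, someone searching for "PDF 1.6" still lands on the PDF family, even though the version is typed as an encoding rather than a format.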

            2.  
              tfmorris Top Contributor Freebase Experts
              Aug 13, 2010
              tfmorris says:

              Cross-posting to the domain so that this discussion is more visible.

            3.  
              tfmorris Top Contributor Freebase Experts
              Aug 13, 2010
              tfmorris says:

              sandos's type Serialization is probably relevant here to help distinguish format models from their serializations.

              Versions of formats should probably just be modeled as separate topics, particularly if there are any significant compatibility issues.

              The extended_from property can be used to link things together, but it's not always 100% accurate for families of things which are more related by branding or continuity than technical compatibility. Not sure how finely this needs to be modeled though...

            4.  
              anjackson
              Aug 24, 2010
              anjackson says:

              Sorry for the delay in replying - I didn't get any notification email, and I'm not sure why as I've allowed it in my profile.

              I agree that we should distinguish between models and their serialisations, although sandos's type still does not distinguish between specific encodings (e.g. XML rather than XML 1.0). That would imply three levels: Specification/Model, File Format, Serialisation/Encoding/Version.

              I'm not quite sure how to align the Model/Specification with the more common file formats. For RDF, there is a well-specified model and a range of encodings. For things like 'Images', the model is less well specified (rather, the core concepts are well-specified, but there are a lot of additional complications that vary between formats), so the mapping is not clear.

              I agree that versions of formats should be separate topics (as I would like URIs for each), but I'm not sure if I would prefer different instances of the 'File Format' type, or a new 'Serialisation' or 'Encoding' type. I suspect that, for now, it might work better if we augment the File Format schema and use that for both file format families and specific encodings. We could add fields to distinguish them, and let things develop for a while before we attempt to prise them apart into separate entities.

              The semantics of 'extended_from' are not clear to me. It has no description in the File Format type, and does not indicate how it should be used. For me, it is critical to be able to distinguish between direct super/subset relationships of encodings (i.e. XHTML is also XML, but a more specific subset - technical compatibility if I understand you correctly) and other relationships (e.g. HTML5 is a later version of HTML4). Without clear guidance on what extended_from means, the data will become chaotic.
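The distinction drawn above could be sketched as typed links in place of a single extended_from property. This is a rough illustration only: the relation names `SUBSET_OF` and `SUCCESSOR_OF`, the `Link` tuple, and the `also_valid_as` helper are all assumptions made up for this example, not existing Freebase properties.

```python
from enum import Enum
from typing import List, NamedTuple, Set


class Relation(Enum):
    # Hypothetical relation names, not Freebase properties.
    SUBSET_OF = "subset_of"        # every valid instance is also valid in the target
    SUCCESSOR_OF = "successor_of"  # later version; compatibility is not implied


class Link(NamedTuple):
    source: str
    relation: Relation
    target: str


LINKS = [
    Link("XHTML", Relation.SUBSET_OF, "XML"),
    Link("HTML5", Relation.SUCCESSOR_OF, "HTML4"),
]


def also_valid_as(fmt: str, links: List[Link]) -> Set[str]:
    """Follow only SUBSET_OF links, transitively, starting from fmt."""
    found: Set[str] = set()
    frontier = [fmt]
    while frontier:
        current = frontier.pop()
        for link in links:
            if (link.source == current
                    and link.relation is Relation.SUBSET_OF
                    and link.target not in found):
                found.add(link.target)
                frontier.append(link.target)
    return found


print(also_valid_as("XHTML", LINKS))  # prints {'XML'}
print(also_valid_as("HTML5", LINKS))  # prints set()
```

Keeping the two relations apart means a tool can ask "what else can parse this?" without accidentally treating a version lineage as a compatibility guarantee.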

              I would really like to work out how to move this forward. It seems that the user who created the File Format type (superkurt) is long gone. How do we help make it better? Who should we talk to?


        2.  

          Please reciprocate properties

          1 post, latest post: tfmorris, Aug 13, 2010
          1.  
            tfmorris Top Contributor Freebase Experts
            Aug 13, 2010
            tfmorris says:

            Please reciprocate the created_by, written_by, and read_by properties onto Software Developer and Software respectively so that the information is visible from that end of things as well.
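As a rough sketch of what reciprocating these properties would make visible, the snippet below derives the reverse view from forward statements. The property names created_by, written_by, and read_by come from the request above; the reverse property names, the `reciprocate` function, and the triple layout are assumptions for illustration, and the sample statements are illustrative data rather than a dump of Freebase.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Forward property on File Format -> hypothetical reverse property name.
RECIPROCALS = {
    "created_by": "file_formats_created",
    "written_by": "file_formats_written",
    "read_by": "file_formats_read",
}

Triple = Tuple[str, str, str]  # (file format, property, developer or software)


def reciprocate(triples: List[Triple]) -> Dict[str, Dict[str, List[str]]]:
    """Group file formats under the topic each forward property points to."""
    reverse: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for subject, prop, obj in triples:
        reverse_prop = RECIPROCALS.get(prop)
        if reverse_prop is not None:  # ignore properties with no reciprocal
            reverse[obj][reverse_prop].append(subject)
    return {topic: dict(props) for topic, props in reverse.items()}


# Illustrative statements only.
statements = [
    ("PDF", "created_by", "Adobe Systems"),
    ("PDF", "written_by", "Adobe Acrobat"),
    ("PDF", "read_by", "Adobe Reader"),
]
views = reciprocate(statements)
print(views["Adobe Systems"])  # prints {'file_formats_created': ['PDF']}
```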


        ©2012  Metaweb