When I joined the Dewey editorial team over two years ago, planning and design efforts for a next-generation (fourth-generation, in fact) editorial support system (ESS) had already occupied the editors for a lengthy stretch of time, as evidenced by a blog entry on the subject from early 2006. The day when those efforts will finally come to fruition is fast approaching: we expect to be using the new system for our editorial work by July.
When the development effort had generated enough documents that we were all losing track of them, someone had the bright idea that we needed to develop a set of categories for organizing them. What an irony, I thought. Here we are devising a set of categories for organizing documents when we have the world’s pre-eminent classification scheme at our fingertips. Why not give the documents DDC class numbers?
The documents have long since been grouped into a few high-level categories, but giving them class numbers still appeals to me. As I survey the documents, I note that they fall into three overall areas: the data, the application, and the development process. Today we look at the data side of the new ESS, much of which falls within 005.7 Data in computer systems. Blog entries on the application and the development process will follow in the coming weeks.
As previously blogged (here and here), an important aspect of the new ESS is that, instead of being maintained in a proprietary format, the data will be represented using the MARC classification and authority formats. (Records for schedule and table numbers, as well as Manual records, will use the classification format, while records for Relative Index headings and mapped headings will use the authority format.) Works on the MARC classification format are classed in 025.420285572 (built from 025.42 Classification, plus T1—0285 Computer applications, plus 572 from 005.72 Data representation, record formats, as instructed under T1—0285). Works on the MARC authority format are classed in 025.32220285572 (built as above, except that the base is 025.3222, the comprehensive number for authority files). Note that, although MARC is in a class-here note at 025.316 Machine-readable record formats, a scatter see reference there instructs that formats for a specific kind of record should be classed in the number for the kind, plus notation T1—0285572, as shown above.
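For readers who like to see the arithmetic, the number building described above can be sketched in a few lines of Python. This is only an illustration: the helper names build_number and digits_after_00 are made up for this post, and the sketch assumes that the digits taken from 005.72 are those that follow its leading 00. It does plain concatenation and does not model any other DDC add instructions.

```python
def digits_after_00(number: str) -> str:
    """Return the digits of a 004-006 number that follow its leading '00'.

    Hypothetical helper: assumes the number starts with '00'.
    """
    digits = number.replace(".", "")
    if not digits.startswith("00"):
        raise ValueError(f"{number} does not start with 00")
    return digits[2:]


def build_number(base: str, *additions: str) -> str:
    """Append notation to a base number, keeping the point after the third digit.

    Hypothetical helper: plain concatenation only.
    """
    digits = base.replace(".", "") + "".join(additions)
    return digits[:3] + "." + digits[3:] if len(digits) > 3 else digits


# Notation T1—0285 Computer applications, written without its table prefix.
COMPUTER_APPLICATIONS = "0285"
# 572, taken from 005.72 as described in the paragraph above.
DATA_REPRESENTATION = digits_after_00("005.72")

# 025.42 Classification is the base for the MARC classification format.
marc_classification = build_number("025.42", COMPUTER_APPLICATIONS, DATA_REPRESENTATION)
# 025.3222 is the comprehensive number for authority files.
marc_authority = build_number("025.3222", COMPUTER_APPLICATIONS, DATA_REPRESENTATION)

print(marc_classification)
print(marc_authority)
```

Running the sketch prints 025.420285572 and 025.32220285572, the two numbers given above.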
Another data representation nicety in the new system is that diacritics and special characters will no longer be coded differently from all of our “standard” data. Everything’s Unicode, a topic in standing room at 005.722 Character sets. All we have to do is drag and drop from the Windows Character Map utility.
Of course, adopting a new data format means converting the data from the old format to the new one. The including note at 005.72 Data preparation and representation specifically mentions conversion to machine-readable form (implicitly from non-machine-readable form), but 005.72 is also the number for conversion from one machine-readable form to another.
Coming next: the features in our new ESS application.