Posts Tagged With ‘XML’


The Evolution of Silicon Designer

We are today issuing a press release about Silicon Designer, two years after the first announcement. It has been a busy two years since we announced this as a product, yet it really has a much deeper history. Today’s release touts our recent success with multi-channel output. Multi-channel is not new to us at all; what is new is that we are finally doing it the right way, from a single system. Let us go back to some of the early work that led to the organic productization of Silicon Designer in 2009 and the recent work that has brought us success with multi-channel, most notably email and tablet output from the same editor used for editing print documents. Finally, we’ll consider the two ways HTML5 is central to the future of Silicon Designer.

Silicon Designer

1998 was a great year. Back then, the Silicon Publishing founders were working at Bertelsmann Industry Services (BIS, now known as Arvato, a group of the Bertelsmann publishing conglomerate), in a San Francisco-based automated typesetting division. That division, formerly Delta Electronic Publishing, had been acquired by Bertelsmann earlier in the 90s and came with a rich 20-year history and codebase.

We were continuing the legacy work of automated typesetting, for example churning out hundreds of directories or catalogs from databases in a single pass using tools such as Xyvision. More exciting, we were leveraging the same data-generated-markup tactics that had been used for 20 years in typesetting, now to feed dynamic content to the web. We were also creating online composition applications. So the same clients that 10 years ago had hired BIS to automate the typesetting of healthcare provider directories for print could now have us build dynamic web applications from the same data (“find the Spanish-speaking OB/GYNs within 10 miles from your home”). When the list came back, we could dynamically create PDF output (“generate a personalized directory from your search results”).

Dynamic PDF circa 1998

We were good at dynamic web output very soon after the web came out. The reality was (and still is) that the fundamental coding central to dynamic typography is essentially the same as the coding of web output. While the web and print paradigms each have unique characteristics from a design perspective, the coding has more similarity than difference, and those of us with robust toolsets for dynamic markup in the mid-90s were able to do things on the web with more sophistication and flexibility than many who were new to the game. Our SGML background from working in the tech doc space didn’t hurt, either. After all, HTML was an SGML dialect.

Even back then, we were building online design tools; they were just crude, limited by the capabilities of browsers and rendition engines. Users would typically control content from HTML forms, through which they could either enter the content for a single document or map data sources and define business rules to generate multiple documents (or single data-generated documents), all from a web interface.

In 1999 we got wind of PGML, which soon evolved into SVG, and we experimented with online editing tools in SVG and VML. JavaScript, CSS and Flash were by that time showing promise as well and giving more power to the UI of document creation sites. Yet over subsequent years we were disappointed by how slowly SVG gained support, especially given its rejection by then-powerful Microsoft and the later about-face by Adobe (that would be the first about-face, for those who are counting… recently they flip-flopped right back thanks to Mr. Jobs).

SVG Draw, an early demo showing the promise of SVG

In 2000, we founded Silicon Publishing. The success with BIS had been too extreme, and at the peak of the dot com era things got just a bit crazy when our division was spun into a dot com unrelated to what it had been doing the previous 20 years.

We founded Silicon Publishing with no greater ambition than to continue to do what we had done at BIS… the most exciting part of which was online document generation.

Silicon Publishing was initially less capable at web-to-print than we had been at Bertelsmann. We could compose with PDF libraries, but we didn’t have a composition engine with real typography on the server such as Xyvision. We found that automating InDesign (even version 1.5) could produce stunning results offline, but it was not licensed for server use. So in our first years we kept web development very separate from print automation in most cases. This was in some ways a step back, yet it let us focus on web and print in separate streams, as 95% of organizations still tend to do today.

We were passionate about SVG and hoped to use that as something of a web-to-print engine, yet it was an uphill battle to get either browser support for SVG from then-dominant Microsoft, or good print support from the SVG spec itself, from open source projects such as Batik and FOP, or from commercial vendors such as Adobe. In those days, Adobe was still squarely focused on the desktop. Still we prototyped online editing tools in SVG.

Our InDesign automation kept going, and we begged Adobe for an InDesign Server for years. They finally released InDesign Server in 2005 along with CS2. We were in the first group of InDesign Server resellers, and we were able to take the batch composition work we’d done with desktop InDesign and port it over to the server.

From then on, Silicon Designer was inevitable. We got requests to build Flash or HTML front ends to InDesign Server, and we started building them one by one. By then we had totally mastered automating InDesign in terms of batch composition, yet there was some work to reconcile the graphic models of InDesign to either Flash or HTML, especially when it came to text.

Pre-Silicon Designer Online Editor

BIS had instilled in us a religious belief in services. The charismatic leader of BIS in the 1990s had been a young guy named Ralf Bierfischer, and he would passionately declare that we did services, not products. I think this was instilled in my subconscious: I never considered productization until it became really obvious. “Hey, we have built the same exact application 15 times now, maybe we should consider building it as a product?”

By 2009 we had built many forms of web front end to InDesign Server, and with the Flex 4/Flash 10 text capabilities, we were really round-tripping Flash and InDesign extremely well. At this point we began building a product based on our years of experience with custom solutions. We were able to hire some amazing talent to do this, in part thanks to Adobe layoffs.

At that same time, a new form of back-end composition entered our lives: Scene7 Web to Print. While InDesign Server had the ultimate composition power one could ever dream of, it had not been built as a true server, and scaling it was (and remains) a fine art. It can be done; with enough instances/servers you can attain any level of throughput, yet the response time is not what one would write home about, even with optimization (by 2009 we mitigated this by letting all edits occur on the client and only using the server for back-end rendition for print). Scene7 was different: it started out as a very true, server-based product.

We started building Scene7 solutions and helping Adobe with the Scene7 product. We decided to have a second form of Silicon Designer, with a common front end but using Scene7 Web to Print instead of InDesign Server (or in some cases combining) on the back end. Scene7 had certain advantages over IDS in some cases: the updates to vector art were very quick, the cross-sell/up-sell capability great (because of the speed of preview); in general it is more server-like, but less finished in terms of pagination and typographic detail. I have explained the differences in fairly excruciating detail.

Silicon Designer implementations with Scene7 Web to Print Back Ends

Still, this did not stop our work with InDesign Server at all. As I’ve explained, there are still tradeoffs, and our Silicon Designer product makes the one big Achilles heel of InDesign Server (rendition time) a non-issue, as we render everything in the client. We have gotten better and better at working with InDesign Server, and our Silicon Designer implementations are now about 50% Scene7, 50% InDesign Server. It is hard to tell whether that will change over time.

InDesign Server Silicon Designer Implementations

We already had an XML model for documents that supported round-tripping InDesign and Flash; now we extended this same model to accommodate Scene7. 2010 was interesting: just when we had gotten Scene7 figured out, it became clear that the HTML/SVG work of our past was going to come back into relevance, thanks to the lack of momentum of Flash on mobile devices. HTML5 is still behind Flash in many ways, yet it has reached such a level of maturity as a spec, and of support from WebKit, that we are adding it to the set of rendition translations from our core XML model.

If that were not enough work, we have recently been extending Silicon Designer into an email editor, and it really works well, far better than the direct HTML editor approach (along the lines of HTMLArea, FCKeditor, TinyMCE) that we used to take for such things. With HTML email, constraints are essential, and keeping the details of the HTML out of the hands of editors can make such a system bulletproof. It helps to work from a core model of rendition that is independent of the UI and the final output.

The future of Silicon Designer is HTML5-based in two ways… it is fast becoming an HTML5 editor, i.e. an editor that produces HTML5 as one form of output, while we are also building a version of Silicon Designer that itself is coded in HTML5. We don’t see Flash going away, it is pretty much a philosophy of “Flash where Flash goes, HTML5 where HTML5 goes” with common models for document rendition and other things that keep the document setup process and print rendition the same with the different UIs. Our core XML model for rendition is the reason for our success, and we have benefited from having to render diverse outputs over the years. It is a fun time for this technology.
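
To make the idea of one core model feeding several renditions concrete, here is a minimal Python sketch. It is not Silicon Designer's actual model: the `document` and `textFrame` element names, their attributes, and both translator functions are hypothetical, invented purely for illustration. The point is only that the UI and each output format can be swapped while the source document stays the same.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical core document model (element and attribute names are assumptions).
MODEL = """
<document width="300" height="100">
  <textFrame x="20" y="40" size="18">Summer Sale</textFrame>
</document>
"""

def to_svg(doc):
    """Translate the core model into an SVG rendition (positioned text)."""
    parts = []
    for frame in doc.iter("textFrame"):
        x, y, size = frame.get("x"), frame.get("y"), frame.get("size")
        text = escape(frame.text or "")
        parts.append(f'<text x="{x}" y="{y}" font-size="{size}">{text}</text>')
    width, height = doc.get("width"), doc.get("height")
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" '
            f'height="{height}">{"".join(parts)}</svg>')

def to_email_html(doc):
    """Translate the same model into constrained, table-based email HTML."""
    rows = []
    for frame in doc.iter("textFrame"):
        size = frame.get("size")
        text = escape(frame.text or "")
        rows.append(f'<tr><td style="font-size:{size}px">{text}</td></tr>')
    return f'<table width="{doc.get("width")}">{"".join(rows)}</table>'

doc = ET.fromstring(MODEL)
svg_output = to_svg(doc)
email_output = to_email_html(doc)
```

Because the email translator only ever emits the markup it was written to emit, editors never touch the HTML directly, which is the kind of constraint the email discussion above relies on.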


DITA to take over the world…

DITA will take over the world… or maybe more like lie under it, as XML does currently.

From my perspective, DITA (or a good part of DITA – there is also the tech doc focus) is the next step in core SGML/XML. IBM started SGML itself, and later had a fair amount to do with XML: now the same sort of people are working on DITA, making XML safe for the world.

DITA extends SGML constructs such as entities with constructs such as conrefs. Everyone loves the idea of re-use of content, but XML 1.0 is a bit too flexible in this regard. It doesn’t say much about *how* you re-use, associate, and aggregate content, thus tools will do the same thing different ways, or won’t support re-use well at all. DITA fixes this, then immediately (concurrently) applies it to Tech Doc.
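
As a rough illustration of what a conref does, the Python sketch below resolves content references inside a single document. It is a simplification under stated assumptions: real DITA conrefs can also point into other files and carry additional validity rules, which this ignores, and the sample topic and the `resolve_conrefs` helper are made up for the example.

```python
import copy
import xml.etree.ElementTree as ET

# Sample topic (hypothetical content): the second <p> re-uses the first by reference.
SOURCE = """
<topic id="warnings">
  <body>
    <p id="unplug">Unplug the device before opening the case.</p>
    <p conref="#warnings/unplug"/>
  </body>
</topic>
"""

def resolve_conrefs(root):
    """Fill each same-document conref with a copy of its target's content."""
    topic_id = root.get("id")
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    for el in list(root.iter()):
        ref = el.attrib.pop("conref", None)
        if ref is None:
            continue
        # A same-file reference has the shape "#topicid/elementid".
        ref_topic, _, ref_element = ref.lstrip("#").partition("/")
        target = by_id.get(ref_element) if ref_topic == topic_id else None
        if target is None:
            el.set("conref", ref)  # leave unresolved references untouched
            continue
        el.text = target.text
        el[:] = [copy.deepcopy(child) for child in target]
    return root

resolved = resolve_conrefs(ET.fromstring(SOURCE))
paragraphs = [p.text for p in resolved.iter("p")]
```

The value of having this defined by the standard is that every conforming tool resolves the reference the same way, instead of each vendor inventing its own include mechanism.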

DITA is based on the practical experience of some IBM tech doc teams and while their goals and requirements were specific to tech doc, many of the core constructs are not.

Similar to XML itself, which is a meta-language (or language for creating languages), DITA has a powerful specialization methodology that allows for completely custom document structures while retaining backwards compatibility with the core DITA constructs. If your <eBookPara> tag is read by a DITA rendition tool that only knows the <p> of DITA, you will at least get things rendered, though perhaps not in the special “eBook” way that you prefer. At least the tools don’t break.
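
The fallback described above works because every DITA element carries a class attribute listing its ancestry, from the base type to the most specialized one. The Python sketch below shows one way a tool might use it. The `ebook` module name, the `KNOWN_RENDERERS` table and the `render` function are assumptions for illustration, and the class attribute is written inline here, whereas in practice it is normally supplied as a default by the DTD or schema.

```python
import xml.etree.ElementTree as ET

# Renderers this hypothetical tool knows about, keyed by "module/element" tokens.
KNOWN_RENDERERS = {
    "topic/p": lambda text: f"<p>{text}</p>",
    "topic/title": lambda text: f"<h1>{text}</h1>",
}

def render(element):
    """Render an element using the most specific ancestor type the tool knows."""
    # The class attribute lists the ancestry, most general first,
    # e.g. "- topic/p ebook/eBookPara ".
    tokens = element.get("class", "").split()[1:]  # drop the leading "-" or "+"
    for token in reversed(tokens):  # try the most specialized type first
        renderer = KNOWN_RENDERERS.get(token)
        if renderer is not None:
            return renderer(element.text or "")
    return element.text or ""  # unknown lineage: emit plain text rather than fail

SAMPLE = '<eBookPara class="- topic/p ebook/eBookPara ">Chapter one begins.</eBookPara>'
rendered = render(ET.fromstring(SAMPLE))
```

A tool that did know about `ebook/eBookPara` would simply register a renderer for that token and be matched first; the document itself would not change.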

It is somewhat confusing that the drivers for DITA remain squarely in the Tech Doc space, yet the solution it provides is often fairly universal. Maybe what DITA needs to do is split into the tech-doc specific DITA and the generic DITA, the way XSL split into XSLT and XSL-FO.


Adobe Learns XML, Slowly

I noticed that the draft FXG 2.0 Specification is finally online. It appears that this will be the form of FXG implemented in CS5.

I have been interested in, and somewhat connected to, Adobe’s approach to XML for quite some time. In the mid 1990s, FrameMaker supported SGML prior to the birth of XML. In 2000, Silicon Publishing worked with Adobe in publicizing FrameMaker 6.5 as an XML-capable tool, though FrameMaker+SGML only worked with XML in a very cumbersome, awkward way.

I will never forget our first project for Adobe, which was one of the very first Silicon Publishing projects. One Friday in 2000 I went to meet Doug Yagaloff, the publishing genius who led Caxton, and he gave me a copy of Frame+SGML and said I just had to do one simple thing: import an XML document, and export it back out. That was a long weekend! I felt like I must be very stupid; it took me forever to get anywhere at all. Thankfully, Sunday night I found a “quick guide” online which exemplifies the great patience that is generally characteristic of those working with SGML and document-centric XML. I was able to show Doug an example the following Monday – “it’s hard, isn’t it?” he smiled.

We worked with Adobe on the FrameMaker 7.0 release, which dramatically improved the XML support. Later we put DITA support into FrameMaker for Adobe, which now gives a real head start when working with document-centric XML. I am a strong believer in DITA.

That is the core, semantic XML that SGML was oriented around at its foundation. InDesign got some bare-bones support for semantic XML with 2.0, but it goes nowhere near as deep as the support of real XML authoring tools. Probably more interesting in terms of Adobe technology (they bought FrameMaker but it stands outside of their main product offerings) is rendition XML, and here was an area more exciting to us at Silicon Publishing.

Rendition XML

In 1998, when I was still at Bertelsmann, one of our former employees who had moved on to Adobe told me about a very exciting new XML specification: PGML. This made great sense to me, and I was an early enthusiast. It was not long before the PGML effort was subsumed under SVG, and Adobe was a major participant in the SVG spec development effort, with their representative, Jon Ferraiolo, serving as lead editor of the spec itself. The Adobe SVG Viewer became the primary way SVG was viewed on the web, while tools like Batik evolved steadily and browsers (with one huge notable exception) gradually evolved support for it. Adobe Illustrator supported and still supports SVG round trip, while InDesign offered SVG export but has since deprecated it.

On another front, rendition of documents, Adobe also participated in the most significant standard: XSL-FO. Here was a document description language highly similar to FrameMaker’s MIF, and again an Adobe expert, Stephen Deach, led the specification definition. FrameMaker never directly supported XSL-FO, but a short-lived server application, Adobe Document Server, offered XSL-FO support via its underlying FrameMaker engine. This was a great XSL-FO implementation, actually, but was not well supported by Adobe and it is now extinct.

On the surface you could consider Adobe a leader in standards-based XML for graphic and document formats. However, as I discussed earlier, there is an interesting mix of motives in the involvement of such companies in web and XML standards. When Macromedia Flash was a competitor, an “open standard” like SVG made sense, but after the Macromedia acquisition, it made less sense.

Adobe has gone down the path of proprietary XML namespaces, not unlike their competitor Microsoft. And like Microsoft, whose XAML is highly derivative of SVG, they have not found a reason to re-invent the wheel.

Three XML Namespaces

There are three XML namespaces that appear critical to the future of document and graphic description at Adobe. These are IDML (InDesign Markup Language), FLA (formerly XFL, the description of Flash), and FXG (the graphic model supported by Flex 4 and central to the designer/developer workflow of Flash Catalyst). FLA handles the complete, interactive Flash model (literally replacing the binary .fla format) while FXG is more about static graphic representation. Theoretically, FXG is open source, as is the Flex SDK, but these remain extremely Adobe-centric efforts.

FXG and FLA have some strong similarities to SVG. In fact Adobe acknowledges the partially derivative nature in the specs. Of course there are differences between what was specified in SVG and what is natural to the graphic model underlying Flash; it appears that SVG would have been difficult to implement across the board, given how Flash was built and the goals of Flex yet they used SVG tags directly where it did fit the Flash model.

FXG is becoming a very powerful specification, now that the Text Layout Framework is built into it. Flash is able to render FXG and Illustrator is able to import/export FXG. With CS5 the designer/developer workflows and the general interaction between print-centric and web-centric work should become much better.

IDML is not derived from XSL-FO, but it does bear a very general similarity to it, especially compared to the earlier INX XML format for InDesign: IDML is at least a complete document object model, whereas INX was merely instructions to the scripting DOM as to how to create the document. It is too bad that Adobe has not managed to reconcile the text engine of InDesign with that of Flash: it appears that IDML will for the near term stay quite separate from the other Adobe namespaces.

To me, FLA is the most exciting new XML namespace coming from Adobe, but it won’t really be exciting until we have an FLA server that can compile FLA to SWF quickly. Dynamic content is possible with Flash in many ways already, but the possibility of making the entire SWF dynamic and manipulating that content in arbitrary ways with XML tools should bring the form of publishing power we envisioned with SVG to life once and for all.

As they have tended to miss the boat on any server application of their technology, Adobe appears to be slow to perceive the value of such a thing (I once asked Kevin Lynch for a Photoshop server – he questioned whether anyone would want it, citing experience with Macromedia Generator). It is an interesting question which group an XFL server may come out of; such a product could be conceived as natural to InDesign Server, to Flash Media Server (or some other work of the Flash Platform group), or to Scene7 (which has very powerful SaaS rendition capabilities, some of which are based on FXG). We are lobbying…

Adobe has finally built some XML foundation under their rendition models, and we are able to attain many of the things we dreamed of back in the SVG/XSL-FO days, via XML if not via open XML standards. I don’t have big hopes for Adobe integrating semantic XML in their core products (FrameMaker being a black sheep outlier), beyond simple metadata (XMP is good enough here, but document-level metadata is trivial compared to true semantic XML). Hopefully the power of their rendition technology with its new XML underpinnings (and consequent greater extensibility) will provide a foundation that enables other companies and open source efforts to make tools that bring the deeper vision of XML publishing to life.


The Two Perspectives on XML

I have been working with XML since it was a glimmer in the eye of Jon Bosak. In fact, before XML was conceived, there was SGML; going from SGML to XML represented a streamlining for the web, but at its core there was not much functional difference; in fact XML is a subset of SGML. The key concept of semantic markup is central to the core value of SGML/XML.

The two main perspectives I have seen are Document-centric XML and Data-centric XML. SGML initially appeared in support of document-centric work: managing all the technical documents or contracts of IBM or Boeing, for example. Charles Goldfarb has maintained that “SGML literally makes the infrastructure of modern society possible” and I think he’s right – hmm, should we blame him for the lengths to which humans have gone to destroy the earth?

The document-centric XML world is really a direct continuation of SGML. When XML came out as a standard in 1998, those of us working with document-centric XML became giddy with excitement, anticipating that the standards being proposed at the time (notably XML itself, XLink, XML Schema, RDF, XSL and precursors to SVG) would finally facilitate tools that made publishing work for organizations that weren’t quite as big as IBM or the Department of Defense. The vision of a semantic web and ubiquitous XML multi-channel publishing seemed to be growing a foundation in theories gaining critical mass, with apparent support of software companies. It appeared these vendors might actually adopt the standards of the committees they were sitting on. “Throw away Xyvision!” I told my boss at Bertelsmann, “this XSL-FO will completely revolutionize database publishing!”

We were sorely disappointed over the next five years. In the years before 1998 W3C standards seemed magical; concepts from the standards were implemented relatively quickly, without perfection but with steady progress: browser updates would reflect CSS and HTML advances; even Microsoft was shamed into some level of compliance. But the monopolistic tendencies of those on the standards committees, coupled with the academic approach of some of the standards committees, managed to make it less and less likely that a given standard would find a functional implementation.

And there was that other perspective – the data-centric side of things. For many reasons, XML was at the right place at the right time in terms of data management and information exchange. In fact, the very year that XML became a standard, it also became the dominant way that machines (servers) talked to each other around the world. Highly convenient for exchanging info, as firewalls would tend to block anything but text over http, while XML markup would allow any sort of specification for data structures, and validation tools would ensure no info was lost.

In 1998, when you asked a programming candidate “what do you know about XML?” only the document-centric people would know anything. By 2000, everyone doing any serious programming “knew” about XML. Trouble was, they typically knew about “XML” only in the much easier-to-use, irrelevant-to-publishing, sense.
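
A small Python sketch may show why the data-centric sense is so much easier to pick up. The two sample fragments and both helper functions are invented for illustration. A data-centric record maps straight onto a dictionary, whereas a document-centric paragraph has mixed content, where text and inline elements interleave and their order carries meaning.

```python
import xml.etree.ElementTree as ET

# Data-centric: every child is a named field holding one value.
DATA_XML = (
    "<provider><name>Dr. Ruiz</name>"
    "<specialty>OB/GYN</specialty><language>Spanish</language></provider>"
)

# Document-centric: text and inline markup are interleaved (mixed content).
DOC_XML = "<p>Press <key>Enter</key> to <emphasis>save</emphasis> the file.</p>"

def record_from(element):
    """Data-centric XML collapses naturally into a flat record."""
    return {child.tag: child.text for child in element}

def flatten_mixed(element):
    """Mixed content must be walked in order, keeping the text between elements."""
    runs = [("text", element.text)] if element.text else []
    for child in element:
        runs.append((child.tag, child.text or ""))
        if child.tail:  # text that follows the child, before the next sibling
            runs.append(("text", child.tail))
    return runs

record = record_from(ET.fromstring(DATA_XML))
runs = flatten_mixed(ET.fromstring(DOC_XML))
```

Applying `record_from` to the paragraph would keep only `key` and `emphasis` and silently drop the surrounding sentence, which is roughly what happens when tooling built for messages is pointed at documents.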

And the standards now had to accommodate two crowds. The work of the W3C XML Schema Working Group, in particular, showed the disconnect. Should a schema be easily human readable? What was the primary purpose of Schema? Goals were not shared by the document- and data-centric sides, and data-centric won out, as they have tended to dominate the XML space ever since that time. RELAX NG came about as an alternative, and if you contrast RELAX NG with W3C Schema, you will see the contrast between the power of a few brilliant individuals aligned in purity of purpose and the impotence of a committee with questionable motives and conflicting goals. Concurrent with a decline in the altruism of committee participants was the huge advance of data-centric XML and the disproportionate representation of that perspective.

Ten years later, we find in the document-centric world that toolsets related to XML in a data sense – parsing, transforming, exchanging info – have made great leaps forward, but we are in many ways still stuck in the 1990s in terms of core authoring and publishing technologies. It is telling that descendants of the three great SGML authoring tools as of 1995 – FrameMaker+SGML, Arbortext Epic, and SoftQuad’s Author/Editor, are, lo and behold, the leading three XML authoring tools in 2009.

There have been some slow-paced advances in document-centric XML standards and tool chains as well, especially the single bright light out there for us, Darwin Information Typing Architecture (DITA), which came out of IBM like XML itself. Yet standards for rendition, XSL-FO and SVG especially, have not advanced along with core proprietary rendition technologies such as InDesign, Flash, or Silverlight, though all of these enjoy nicely copied underpinnings pillaged from the standards. More important, nothing has stepped in to replace the three core authoring tools: the “XML support” of Microsoft Word and Adobe InDesign, for example, does not approach the capabilities of a true XML authoring application. There is a proliferation of XML “editors” but most of the new ones are appropriate for editing a WSDL file or an XML message (the data-centric forms of XML), not a full-fledged document.

Meanwhile, on the data-centric front, XML has simply permeated every aspect of computing. There are XML data types in database systems, XML features in most programming languages, XML configuration files at the heart of most applications, and XML-based Web Services available in countless flavors.

Document-centric XML is simply a deep challenge that will take more time (and probably more of a commercial incentive) to tackle. For the time being, structured authoring managed the XML way is still implemented mainly by very large organizations: such an approach has “trickled down” from organizations the size of IBM to organizations the size of Adobe (which does, in fact, use DITA now), but there are not yet tool chains available that will bring it down much further. The significance of the W3C XML Schema Working Group’s failure to provide a functional specification supporting document-centric XML can hardly be overstated.

As long as content is not easily authored in a semantically rich, structured fashion, the vision of the semantic web will remain an illusion. When and if document-centric XML gets more attention from standards bodies and software vendors, human communications will become far more efficient and effective.


Welcome

Welcome to the first post of my new blog. I am Max L. Dunn. While there are plenty of other Max Dunns out there (I am often mistaken for Max S. Dunn, for example), I’m the one who co-founded Silicon Publishing, a company devoted to publishing solutions, back in 2000. We automate data-generated publishing solutions, build graphic and layout software, and increasingly connect web and print publishing workflows. We’re immersed in Adobe technology (Adobe is both a partner and a client), most focused on Adobe InDesign Server and the connection of that composition engine to data, and the reconciliation of that technology with HTML5 and Adobe Flash.

My deep long-term interest is XML from a document-centric perspective. We put DITA into FrameMaker for Adobe back in Frame 7.2, after helping make 7.0 work with XML in the first place, and continued to help Adobe with DITA in Frame 8 and 9 as well. We also developed our own Frame/DITA plug-in with Leximation for those that are really serious about such things. At this point in time our semantic XML work doesn’t connect very directly to our Web/InDesign Server work; I expect one day it will. I co-wrote a chapter of the XML Handbook with Charles Goldfarb on WYSIWYG XML Authoring, and realizing this vision in the InDesign/Web world looks more attainable each year, slowly working its way onto the road map for our Silicon Designer product.

I am big on standards, in theory: I am owner of the SVG Developers’ Group, for example, and I have tons of experience with XSLT and XSL-FO. Yet the sad reality is that as of 2009 such standards are rarely used directly. Rather, we find such standards copied into proprietary “standards” by the large software companies that we still depend on for software that actually works.

In this blog, I’m going to be sharing opinions, information, and stories of my life in publishing technology. Posts will range from opinionated rants to factual explanations of how to tackle the challenges we face in our day-to-day work. I hope to be joined by some in Silicon Publishing in these writing efforts.

I’d like to hear from you. Help guide this blog by posting your own comments, resources, knowledge, and opinions. If you have questions about web-to-print technologies, Adobe tools, or XML standards, let me know. I’ll do my best to answer them here.