Friday, March 30, 2012

Announcing bookshop: HTML to PDF & eBook Publishing



At long last we are excited to announce the release of bookshop 0.1.0. This is the first public release of bookshop.

The bookshop website.
The source code (scroll down for detailed installation notes and a tutorial for using bookshop)

What is bookshop?

Bookshop is an open-source book development and publishing framework for authors, editors, publishers and coders in today's publishing industry. Bookshop provides best practices for developing your books in HTML/CSS/JS, allowing them to be transformed into potentially any book format (Print-PDF, PDF, mobi, ePub, etc.).

A Book in HTML?

Yup. You heard us right. Your entire book is stored in HTML. We use CSS to describe the design and layout of your book. Then through the magic of bookshop (and some other tools) we turn your book into a print-ready pdf, an online viewable pdf, an iBooks compatible (ePub) book, a Kindle compatible (mobi) book, and even an html version you can view in the browser.

Pretty cool, huh?!

What's Wrong With the Status Quo?

Most publishing houses follow a pretty traditional process (toolchain) for taking an author's manuscript and turning it into all the formats we love: the paperback book, the iBooks format, Kindle, etc.

Generally the process goes something like this:
  1. Author turns in a manuscript.
  2. Publishing house turns it into 4 - 8 different formats: print-pdf, online-pdf, kindle, ibooks, etc.
  3. Errata (typos) are received.
  4. Each format is re-edited by hand and re-released.
What's the harm in this toolchain?
  1. Redundancy: For every edit, there are potentially 8 documents that have to be touched.
  2. Human Error: With so many edits, the potential for error greatly increases.
  3. Lack of Scalability: As more authors and more editions are added, workload and complexity multiply, since every book has to be maintained in every format.
  4. Headcount Intensive: The more success a publishing company has, the more employees they have to throw at the toolchain to keep the process going.

What Makes bookshop So Different?

A Separation of Concerns

To begin, let's take a look at the basic parts that make up any book. 
  • Content
    • the actual words, grammar, parts-of-speech
  • Structure
    • preface, chapters, sections, etc.
  • Style
    • the layout, design, art, typeset
  • Format
    • the end result, client usable format, print-pdf, mobi, epub
We decided that it made more sense to abstract these four layers into their own components.

So let's take an example. Let's say that we have a book that needs to be developed and our first delivery is for the pdf that will be sent to the printer. For the sake of brevity we'll just work with the first paragraph of our first chapter.

Content

The content for books in bookshop is stored in HTML. So our sample chapter looks like this in its rawest form:

<h1>The Web and HTML</h1>
<p>Cascading Style Sheets (CSS) represent a major breakthrough in how Web-page designers work by expanding their ability to control the appearance of Web pages, which are the documents that people publish on the Web.</p>

Structure

Thanks to the Boom! microformat, we can then use basic HTML tags and CSS classes to describe the structure of the book. So in our first chapter's HTML we add a <div> with a "chapter" class to tell us that this section is a chapter.

<div class="chapter">
    <h1>The Web and HTML</h1>
    <p>Cascading Style Sheets (CSS) represent a major breakthrough in how Web-page designers work by expanding their ability to control the appearance of Web pages, which are the documents that people publish on the Web.</p> 
</div>

Style

Once we have our HTML Content and our Structure in place, we can use Cascading Style Sheets (CSS) to create the layout, look and feel of our book. The great thing about having this as a separate layer is that we can now create CSS style sheets for any layout that we want. So back to our example. We are designing a style sheet for a pdf that will be going to the printer. We need to specify the paper size as 6in X 9in, specify some margins, and then make sure that our chapters always begin on right-facing pages. So in our css we would put:

/* stylesheet.pdf.css */
@page {
  margin: 27mm 16mm 27mm 16mm;
  size: 6in 9in;
}
div.chapter { page-break-before: right;}
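
To illustrate the point about keeping style in its own layer, here is a minimal sketch of a second stylesheet for the same HTML, this time aimed at a pdf meant for on-screen reading. The file name and every value in it are hypothetical and not taken from bookshop; only the div.chapter selector comes from the example above.

```css
/* stylesheet.screen.css (hypothetical file name, illustrative values) */
@page {
  size: A4;      /* assumption: a larger page for on-screen reading */
  margin: 20mm;  /* one value applies to all four sides */
}
/* No facing pages on screen, so a chapter simply starts on the next page */
div.chapter { page-break-before: always; }
```

The HTML from the Content and Structure steps is untouched; only the rules describing the page and the chapter breaks differ.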

Format

Last, we will need to create the actual file that goes to the printer. With bookshop we simply type bookshop build pdf from the command line and a printer-ready pdf is magically created for us. Well, it's not magic so much as a great tool called PrinceXML, a tool for creating pdf from html and css.

And there you have it!

Why Does This Make the World a Better Place?

Ok, so it may not make the world a better place, but it might make your publishing business a better place.

Benefits We Have Noticed 

  • Editing errata (typos) happens in one place and only once, in the HTML, instead of in 8-10 different formats for each book
  • We can scale infinitely. The only thing we have to add for a new format is a new stylesheet; the HTML source stays the same. This potentially means that one developer can manage a large portfolio of authors and books.
  • We can hire/train programmers who know HTML/CSS/JS instead of InDesign'ers, etc. And we don't have to fork out several thousand dollars for proprietary design software.
  • The ramp-up for potential authors/publishers is minimal, since they can just copy and paste their own content into our example book, type "bookshop build [whatever]" and get their book in epub, pdf, html, or kindle
  • We can dramatically alter a book design/layout with just one stylesheet. So if we want to run with a completely different look & feel on the next edition of a book, we simply add another stylesheet.
  • We can incorporate best practices in software development: agile release cycles, version control, testing (this may sound geekish, but it means a dramatic increase in efficiency and organization)

In Closing

Well, that's about it for now. We have much more work to do. Please note, this is only the 0.1.0 release which means that it is only basically functional. You can expect to experience a number of hiccups for now. But hey, it's open source which means that you can help out by doing any of the following:
  • Try it out and let us know what works for you and what doesn't
  • If you are a developer, please help us out by adding a feature, fixing a bug, or cleaning up our code
  • If you are a designer, create your own stylesheets and send them our way. We hope to develop a large arsenal of stylesheets to serve as templates for others.
  • If you have published a book with bookshop, let us know so we can list you in our directory