Polopoly and XML
March 24, 2008 in College Publisher, Websites
As the first school to beta test the new College Publisher 5.0, or Polopoly system, I have found a real need for more feedback from other papers about what they would like to see in the new system. I also hope this blog will be a good platform for keeping people updated on what is new in Polopoly.
I make no promises to the grammarians out there. I am going for up-to-date content over the best-written content. Sometimes you may get ramblings that make no sense on their own but that will hopefully read coherently in the context of this blog.
Let’s get started…
One of the new, and actually working, features in Polopoly is XML upload of your paper. This is a huge time saver but takes some planning.
What is XML?
XML itself was standardized in 1998, but its ancestors, GML and later SGML, have been around since the 1970s in one form or another. It is not like HTML, where you have a ton of predefined tags. For example, type <p> in HTML and you get a paragraph. XML allows you to create your own “tags”. It is also hardware and software independent. You don’t need Dreamweaver or any other special software to read or write XML; it is a basic text file.
Here is a link to some common XML myths…
For the purpose of our discussion, which is taking InDesign or Quark content and repurposing it for the web, XML also does not carry forward any sort of formatting. The same would be true if you went from a web to a print workflow. XML just carries your “data” or, in common newspaperese, your stories, headlines, subheadlines, bylines, etc.
Great, so I can just export my content, right?
Yeah, not exactly. That would be too easy. First, you have to get your content “tagged”. Simply put, you have to tell InDesign (and I’m assuming Quark, but I don’t have a copy, so don’t quote me) which tags go with which parts of your paper. Since you can create your own XML tags, you can have an XML tag called “story” that you use to label all your stories. The same would be true for headlines, subheadlines, bylines, etc. If you do not tag anything, you will get an empty text file with just <root></root>.
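To make the idea of tagged content concrete, here is a minimal Python sketch of what a tagged export might contain, next to the empty result you get when nothing is tagged. The tag names (story, headline, byline, body), the helper names, and the sample text are all made up for illustration; a real InDesign export depends on the tags you create and how your document is structured.

```python
import xml.etree.ElementTree as ET


def build_export(stories):
    """Build a <root> element holding one <story> per tagged story."""
    root = ET.Element("root")
    for item in stories:
        story = ET.SubElement(root, "story")
        # Hypothetical tag names; you would use whatever tags you created.
        for tag in ("headline", "byline", "body"):
            ET.SubElement(story, tag).text = item[tag]
    return root


def to_text(root):
    # short_empty_elements=False writes <root></root> rather than <root />
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


tagged = build_export([
    {
        "headline": "Sample headline",
        "byline": "By Staff Writer",
        "body": "Sample story text.",
    }
])
untagged = build_export([])  # nothing tagged, so nothing to export

print(to_text(tagged))
print(to_text(untagged))
```

Note that the output carries only the text and its labels. No fonts, sizes, or column settings come along.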
If you happen to be using a print workflow system like K4, I think you may have some XML information readily at hand. K4 is rather pricey, so I don’t have a copy, and I would love to speak with anyone who knows a bit about how K4 works. At Boise State, we use InCopy with InDesign, and I am researching how those two work together for XML readiness.
XML workflow…
There are 3 basic ways to tag XML items in InDesign:
1. Manually: open up your tags window, select a text box and click on a tag. If you don’t see any tag names, just create one. You can also drag and drop tags, though I found this to be more time consuming than selecting a text box and clicking the tag name. Drag and drop is more like archery, trying to hit the bullseye.
2. Auto-tagging: sounds great but really isn’t. Auto-tagging is a fairly dumb tagging system and, in my experience, not conducive to a newspaper workflow/layout system.
3. Mapping tags: you can map a tag to a character and/or paragraph style. I think this holds the most promise and could be an easy way to transition. It does take some thinking about your paragraph styles and how you use them, but it doesn’t require you to change or limit what you use. I still need to work with this and the Polopoly system to get the configuration correct.
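The mapping idea in option 3 can be sketched outside InDesign as a simple lookup from style name to tag name. This is only a Python illustration of the concept, not how InDesign implements it; the style names, tag names, and the tag_paragraphs helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical paragraph style names mapped to hypothetical tag names.
STYLE_TO_TAG = {
    "Headline": "headline",
    "Subhead": "subheadline",
    "Byline": "byline",
    "Body Text": "body",
}


def tag_paragraphs(paragraphs, style_to_tag=STYLE_TO_TAG):
    """Wrap each (style, text) pair in the tag mapped to its style.

    Paragraphs whose style has no mapping are skipped in this sketch.
    """
    story = ET.Element("story")
    for style, text in paragraphs:
        tag = style_to_tag.get(style)
        if tag is None:
            continue  # unmapped style: leave it out of the export
        ET.SubElement(story, tag).text = text
    return story


layout = [
    ("Headline", "Sample headline"),
    ("Byline", "By Staff Writer"),
    ("Body Text", "First paragraph."),
    ("Pull Quote", "Not mapped, so not exported."),
]

story = tag_paragraphs(layout)
print([child.tag for child in story])
```

The appeal of this approach is that the mapping is set up once. After that, applying your usual paragraph styles does the tagging for you.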
Once you have everything tagged, you export the file, and it is ready for uploading or ingesting into whatever XML-aware system you have. Like tagging, the setup takes some effort the first time, but it shouldn’t be noticeable afterwards, provided you don’t change your tag names.
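Because everything downstream depends on the tag names staying the same, a quick check of the exported file before upload can catch a renamed or missing tag. The following is a minimal Python sketch under assumed conventions: the required tag names and the find_problems helper are hypothetical, and what Polopoly actually expects may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names that the receiving system is assumed to expect.
REQUIRED_TAGS = ("headline", "body")


def find_problems(xml_text, required=REQUIRED_TAGS):
    """Return messages describing stories that lack a required tag."""
    root = ET.fromstring(xml_text)
    stories = root.findall("story")
    if not stories:
        return ["no <story> elements found"]
    problems = []
    for number, story in enumerate(stories, start=1):
        for tag in required:
            element = story.find(tag)
            # A tag counts as missing if it is absent or has no text.
            if element is None or not (element.text or "").strip():
                problems.append(f"story {number}: missing <{tag}>")
    return problems


# XML tag names are case-sensitive, so <Headline> is not <headline>.
sample = """<root>
  <story><headline>Sample headline</headline><body>Sample text.</body></story>
  <story><Headline>Renamed tag</Headline><body>More text.</body></story>
</root>"""

print(find_problems(sample))
```

An empty list means every story carries the tags the check looks for.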
Time
As I see it, this all comes down to time. We currently use the cut and paste method to get stories from the printed version to online. For an average 12-page broadsheet issue, this process normally takes an hour and a half to upload the text and photos. The bulk of the time is spent copying and pasting text.
I have done a few tests tagging a couple of old issues and uploading them into Polopoly. My average time to tag all stories was around 10 minutes. Uploading took about 1 minute. My longer times were on my laptop (using a touchpad slowed me down). Having dual or big monitors helps: the more I can see on the screen with my palettes still open, the less time it takes. The process is also not yet efficient, as I was working around our current template. Note that once I uploaded, I still had to go in and do some editing and checking. This did take time, but changes to our template and to the Polopoly GUI should decrease the time spent just uploading content.
Polopoly also has a “mass” image uploader.
Around the corner?
I think XML will keep growing in the publishing/media industry. It is already being discussed at that level, and as college advisors we should be aware of it as well.
I am hoping that this coming fall we can move to a web-first workflow system via Polopoly. An XML export FROM Polopoly into InDesign/Quark would be very helpful. Since XML contains no formatting information, I would still keep all my design options. As a matter of fact, it could help cut down print design time by using the mapping function, so everything comes in with the defaults you use 95% of the time. You just drag the text boxes where you want them.
Because of this, I am going to be looking at user management in Polopoly next. I am hoping to get feedback from all of you on what you would like to see.
Brad, what newspaper are you with? Can we see your new Polopoly site?
I'm with The Arbiter at Boise State University. Our website is currently using the CP 4.0 system. We are beta testing the new Polopoly system, or CP 5.0. The site isn't live, so I really don't have anything I can show the world yet.
Brad-
I think you hit the nail on the head with your web-to-print comments.
This is a major topic in the newspaper industry, and while most of the larger operations will get this option in their K4, MediaSpan, Woodwing and various other editorial front-end systems, there are many papers, both college and commercial, that rely on a less complex word processing/page layout and file folder method (us included).
If College Publisher doesn't get this right, there are other open-source content management systems that may offer a better solution for little or no cost.
Drupal has a newspaper group that is currently trying to tackle this module:
http://groups.drupal.org/node/10012#comment-32130
Other open-source platforms such as Django and Joomla may be working on, or may already have, web-to-print solutions as well.
I'm also optimistic that we're not too far away from having integrated off the shelf solutions for publishing to the Web and back, with software enhancements to InDesign and similar publishing products.
As next in line for the College Publisher upgrade, I'll be anxiously watching how your migration goes. Keep us posted.
-Harry