Sunday, January 7, 2007

Saturday, January 6, 2007

Tuesday, January 2, 2007

Exploring the XML File Formats

[MS Open XML]

http://msdn2.microsoft.com/en-us/office/aa905362.aspx

Article

  • How to: Manipulate Office Open XML Formats Documents

    Learn about the components that are included in a formatted file and about several scenarios that show the versatility of these files.

  • Introducing the Microsoft Office (2007) Open XML File Formats

    Easily exchange data between Office applications and enterprise systems

Download

  • 2007 Office System Sample: Manipulating Office Open XML Format Files

    Download a set of sample files and Visual Studio project files that illustrate how to manipulate Office Open XML format documents programmatically.

  • Office Open XML Document Interchange Specification

    Based on a submission from Microsoft, this Standard defines Office Open XML's vocabularies and document representation and packaging for the "Office 12" versions of Word, Excel, and PowerPoint.

  • Open XML File Format Code Snippets for Visual Studio 2005

    Download and install these snippets to your Visual Studio code snippet folder, and then use them when customizing Excel 2007, PowerPoint 2007, and Word 2007.

Video

  • Generating Office Documents using the New Open XML File Formats

    Generate, read and modify documents without going through the object model of the hosting Office application.

  • Building Documents from Scratch Using the Office XML File Formats

    Brian Jones gives an 8-minute demo around the Open XML formats, including key things that go into building a document.

  • Open XML File Formats

    Mauricio Ordonez, Doug Mahugh, Kevin Boske and Brian Jones discuss the Office Open XML file formats.

    -----------------------------------

     

    [Reference] http://chilco.textdrive.com/~dmahugh/2006/01/09/exploring-the-xml-file-formats/

     

     

    This is a continuation of the XML File Formats Overview post …


    In a previous post, we covered the basic concepts behind the new Office Open XML file formats. Now let’s look at an example in a little more detail, to get a feel for how the new file formats work.

    We’ll just create a new Word document and type “Hello Word!” into it. We’ll change the font to 24pt bold Segoe UI, to see how some of the basic formatting is handled, and then we’ll save the file as HelloWorld.docx.


    Next, let’s rename that file and change the extension from DOCX to ZIP. Note that the icon associated with the file changes — Windows now sees it as a ZIP archive instead of a Word document. And since it’s a ZIP file now, we can explore its contents with WinZip, WinRAR, PKZIP, or any other ZIP compression tool.
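
    The same exploration can also be done from code. The sketch below uses Python's standard zipfile module to list the entries of a package; it is an illustration, not part of the original post. The helper names and the stand-in archive are hypothetical, and the stand-in only mimics the layout so the example runs without Word installed.

```python
import io
import zipfile


def list_package_parts(docx_file):
    """Return the names of all entries in a .docx (ZIP) package."""
    with zipfile.ZipFile(docx_file) as package:
        return package.namelist()


def build_stand_in_package():
    """Build a tiny in-memory ZIP that mimics a package layout.

    Hypothetical helper: the result is NOT a valid Word document.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Pass a real path such as "HelloWorld.docx" instead of the stand-in.
    for name in list_package_parts(build_stand_in_package()):
        print(name)
```

    Because zipfile looks at the file's contents rather than its extension, renaming the file to .zip is not needed for this approach.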

    So if we double-click the HelloWorld.zip file, we can see the contents. In our example file the contents look like this:

    You can see that the document contains three folders, and something called [Content_Types].xml. The “word” folder contains the actual content of the document, so let’s drill down into that folder. Here’s what it contains:


    Again, we’re going to just focus on the actual content of the document, and that’s contained in the document.xml file.
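
    As a rough sketch of what reading that part looks like in code, the Python example below pulls the text out of word/document.xml. It assumes the text sits in elements whose local name is "t", as in released WordprocessingML files; a beta-era file may use a different namespace, so the code matches on local names only. The helper names and the stand-in package are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of released WordprocessingML; only the stand-in below uses it.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_text(docx_file, part_name="word/document.xml"):
    """Return the text stored in the main document part."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read(part_name))
    # Compare local names only, so the namespace URI does not matter.
    return "".join(
        element.text or ""
        for element in root.iter()
        if element.tag.rsplit("}", 1)[-1] == "t"
    )


def build_stand_in_package(text):
    """Build a minimal in-memory package (hypothetical, not a valid Word file)."""
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>{text}</w:t></w:r></w:p></w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Pass a real path such as "HelloWorld.docx" instead of the stand-in.
    print(read_document_text(build_stand_in_package("Hello")))
```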

    Note that you can change the contents of the file by editing the XML file directly; you don’t need Word to do this! As an extreme example, you could send a Word or Excel document created in Office 12 to somebody working on the original IBM PC (running PC-DOS from 1981), and as long as they have PKZIP installed they could edit the file. More commonly, if you have WinZip or PKZIP installed, you can drag a copy of document.xml to your desktop, edit it with Notepad, drop it back into the ZIP container, change the extension from ZIP back to DOCX, and you’ve made your change to the file.
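
    The drag, edit, and drop-back steps can be approximated in code as well. Python's zipfile module cannot rewrite a single entry in place, so this sketch copies every entry into a new archive and swaps in the edited bytes for one part. The function name replace_part and the sample contents are hypothetical, and the stand-in archive is not a valid Word document.

```python
import io
import zipfile


def replace_part(source, target, part_name, new_bytes):
    """Copy a package from source to target, replacing one part's content."""
    with zipfile.ZipFile(source) as src, \
            zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename == part_name:
                data = new_bytes  # the edited part
            else:
                data = src.read(item.filename)  # everything else is unchanged
            dst.writestr(item.filename, data)


if __name__ == "__main__":
    # Stand-in package so the sketch runs without a real document.
    original = io.BytesIO()
    with zipfile.ZipFile(original, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("word/document.xml", "<doc>before</doc>")

    edited = io.BytesIO()
    replace_part(original, edited, "word/document.xml", b"<doc>after</doc>")

    with zipfile.ZipFile(edited) as package:
        print(package.read("word/document.xml").decode("utf-8"))
```

    With real files, source and target would be two different paths, for example reading HelloWorld.docx and writing a new file next to it, so the original stays intact if the edit goes wrong.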

    Want to explore an Office 12 document yourself, but you’re not on the beta program? No problem, here’s a copy of the HelloWorld.zip file used for this little example.

    This was a very simple demo, just to show the basic concepts. When we get back to this topic (it may be a while, I’m travelling a lot this month), we’ll look at how the new file formats handle embedded pictures and other binary objects. As with this example, the details are surprisingly simple and straightforward after you know the underlying architecture of the XML file formats.

    One Response to “Exploring the XML File Formats”

    1. OpenXML Developer : Learning about Open XML on-line Says:

      […] Learning about Open XML on-line

      Open XML is a new standard. So new, in fact, that the schemas are still being edited and haven’t been published by Ecma yet. And there are no books out on Open XML development, although that will surely change in the next year. So for now, the best place to learn about Open XML is on-line. This site will be a growing repository of information, and there is also some great information on blogs already. Here are some links to useful posts for Open XML developers …

      First, for .NET developers, you can use the new WinFX packaging API to read/write/create Open XML documents. Kevin Boske has a post on his blog about “Getting Started with Office Open XML and WinFX” that provides a straightforward overview of what you’ll need and how to get started. If you don’t have WinFX, you can get the February CTP here.

      Kevin also has some other posts of interest to Open XML developers:

        • “Deleting a part” provides an example of removing the VBA project from any Office Open XML file. This approach can easily be generalized to removing any component from the Open XML package. Includes source code, and some interesting dialog in the comments. Here’s a link to a code snippet that includes the updated ECMA namespaces.
        • “How to create documents programmatically” is a high-level summary of the issues and options in creating an Open XML document from scratch in your own application.

      In learning the Open XML formats, most developers start with word processing documents. They’re probably the easiest to understand, and certainly the most widely used in real-world applications. Brian Jones has a great post entitled “Introduction to Word documents” that covers the basics. He provides a detailed look at how DOCX files store all the pieces that make up a typical Word document: styles, bullets and numbering, font information, document settings, story content, tables, custom-defined XML, sections, and headers/footers. This is worth a very careful read if you’re writing code that modifies or creates Word documents. The discussion in the comments also covers some good points.

      Another area of great interest for developers is Open XML’s support for custom schemas. You can define highly customized schemas for your particular domain or application, and integrate those schemas into documents so that your code can use your own semantics to describe or access the contents of those documents. Brian has three good posts on this topic:

        • “Custom Defined Schemas” covers the basics of how to use custom schemas in Open XML.
        • “Create a rich Word document based on your own custom XML” includes a sample ZIP file that provides a great example of how content controls can be bound to a custom XML schema.
        • “Integrating with business data: Store custom XML in the Office XML formats” is a high-level overview of how Office 2007 will support the use of custom schemas.

      Brian’s blog is the most comprehensive source of Open XML technical details on the web to date. These posts also provide useful information for Open XML developers:

        • “Inclusion of alternate formats” discusses how to include other documents, such as a PDF representation, in an Open XML document. Some of this doesn’t work yet with Office 2007 Beta 1, but Brian explains where it’s all headed.
        • In “Answer to question on package relationships” Brian answers a reader’s question about the thinking behind the design of the package relationships within an Office Open XML document.
        • “Why Office has moved to XML formats” is interesting background information for those who might wonder why Microsoft Office is moving from proprietary binary formats to Open XML file formats in the next release, Office 2007.

      Finally, if you’re at the “what the heck is Open XML?” phase of learning about this topic, the “Exploring the XML File Formats” post on my blog covers the basic concepts without getting into the technical details.

      Published Sunday, March 19, 2006 4:40 PM by dmahugh
      Filed Under: Open Packaging Convention, WordProcessingML, SpreadsheetML, PresentationML, .NET (C#, VB, J#, C++/CLI), Java […]