Summary of parsing Word 2007 with POI 3.6

I have only just started working with XWPF. Before that I had finally become fairly familiar with HWPF, and while feeling my way through POI's Word 2007 support I ran into many problems. There is not much useful information about POI online, and almost none about XWPF, so here is a short summary:

1 Parsing Word 2007 needs 7 packages: poi, poi-scratchpad, poi-ooxml, poi-ooxml-schemas, xmlbeans, dom4j, geronimo-stax-api. If any of them is missing, it will certainly fail at run time. I found no sign of conflicts with the JDK version.
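A missing jar only shows up when the code first touches a class from it, so a quick check at startup can save some guessing. The following is a minimal sketch that uses only the JDK; the class name ClasspathCheck and the method missing are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which classes cannot be loaded from the classpath. */
final class ClasspathCheck {

    /** Returns the names from classNames that could not be found or linked. */
    static List<String> missing(String... classNames) {
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only check that the class can be located
                Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

One would pass one representative class per jar, for example org.apache.poi.xwpf.usermodel.XWPFDocument for poi-ooxml, org.apache.xmlbeans.XmlObject for xmlbeans and org.dom4j.Document for dom4j, and fail early if the returned list is not empty.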

2 XWPF seems to be self-contained: its objects do not inherit from anything in HWPF. I do not know whether the original structure was poorly designed or whether the two Word file formats really differ that much. Either way, it causes a lot of trouble for me, because my existing program has to stay compatible with several versions of Word.

3 As things stand, XWPF does not seem to support an isInTable check. The getText code in XWPFWordExtractor actually parses the body text first and then the tables. Could it be that the authors also found no way to locate where a table sits? In addition, the contents of a table no longer seem to be stored in paragraphs.

Because XWPF parses the XML through the OpenXML schema classes and groups all tags of the same kind together, the original order can no longer be recovered once the content has been exported to text. The docx file itself stores everything in order, but after parsing, paragraphs and tables are kept separately and the positional relationship between them is lost. The Extractor can only output each group in turn. It is a half-finished product, and I am speechless.
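Since the order is still present in the file, it can be recovered by walking the children of w:body directly. Here is a rough sketch that bypasses POI and uses only the JDK DOM parser. It assumes the content of word/document.xml has already been read from the docx zip into a string; the class name BodyOrder and the method bodyOrder are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: lists paragraphs and tables in the order they are stored. */
final class BodyOrder {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns "p" or "tbl" for each direct child of w:body, in file order. */
    static List<String> bodyOrder(String documentXml) {
        Document doc;
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(documentXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse document XML", e);
        }
        List<String> order = new ArrayList<>();
        Node body = doc.getElementsByTagNameNS(W, "body").item(0);
        if (body == null) {
            return order;
        }
        // Only direct children count; paragraphs nested inside table cells are skipped.
        for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && W.equals(n.getNamespaceURI())) {
                String name = n.getLocalName();
                if ("p".equals(name) || "tbl".equals(name)) {
                    order.add(name);
                }
            }
        }
        return order;
    }
}
```

With the order known, the i-th "p" and j-th "tbl" markers can be matched against the separately stored paragraph and table lists.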

4 A cell can contain multiple paragraphs, but there is no getParagraphs method, only getParagraph, which returns a single paragraph. The remaining paragraphs must be read some other way.
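One possible "other way" is to read the cell's w:p children straight from the XML. The sketch below uses only the JDK DOM parser on a string of table XML; the names CellParagraphs and read are invented for this example, and nested tables inside a cell are not handled.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: returns every paragraph of every table cell. */
final class CellParagraphs {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** One inner list per w:tc, holding the text of each direct w:p child. */
    static List<List<String>> read(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        List<List<String>> cells = new ArrayList<>();
        NodeList tcs = doc.getElementsByTagNameNS(W, "tc");
        for (int i = 0; i < tcs.getLength(); i++) {
            List<String> paragraphs = new ArrayList<>();
            for (Node n = tcs.item(i).getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element && W.equals(n.getNamespaceURI())
                        && "p".equals(n.getLocalName())) {
                    // Concatenate all w:t text inside this paragraph.
                    StringBuilder text = new StringBuilder();
                    NodeList ts = ((Element) n).getElementsByTagNameNS(W, "t");
                    for (int j = 0; j < ts.getLength(); j++) {
                        text.append(ts.item(j).getTextContent());
                    }
                    paragraphs.add(text.toString());
                }
            }
            cells.add(paragraphs);
        }
        return cells;
    }
}
```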

5 Judging from the test programs, attachments (such as images) can be read out, and hyperlinks should be readable too. I have not looked at this in detail.

6 Cells provide no way to read their width, which is frustrating. Tables have many problems. The authors did not even get around to implementing getText, which shows how complete the code is:

    public String getText() {
        //TODO
        return null;
    }
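The width is present in the file as a w:tcW element inside the cell's w:tcPr, so it can be read from the XML directly. This is a small sketch with the JDK DOM parser, not a POI API; CellWidths, widths and child are hypothetical names, and -1 is an arbitrary marker chosen here for "no width stored".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads the w:tcW value of each table cell. */
final class CellWidths {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the w:w attribute of w:tcPr/w:tcW per cell, or -1 if absent. */
    static List<Integer> widths(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        List<Integer> result = new ArrayList<>();
        NodeList cells = doc.getElementsByTagNameNS(W, "tc");
        for (int i = 0; i < cells.getLength(); i++) {
            Element tcPr = child((Element) cells.item(i), "tcPr");
            Element tcW = tcPr == null ? null : child(tcPr, "tcW");
            String w = tcW == null ? "" : tcW.getAttributeNS(W, "w");
            result.add(w.matches("\\d+") ? Integer.parseInt(w) : -1);
        }
        return result;
    }

    /** First direct child element with the given local name in the w namespace. */
    static Element child(Element parent, String localName) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && W.equals(n.getNamespaceURI())
                    && localName.equals(n.getLocalName())) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

The number is in twentieths of a point when w:type is "dxa"; other types such as "pct" or "auto" need separate handling that this sketch leaves out.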


7 One advantage: in theory, content inserted or deleted through tracked changes can already be distinguished (although the Extractor still outputs everything in full). It has the same flaw as tables, though: inserted and deleted content is read out by category, so the order again cannot be determined. The Extractor therefore outputs normal text first, then deleted text, then inserted text. Speechless...

    protected XWPFParagraph(CTP prgrph, XWPFDocument docRef) {
        this.paragraph = prgrph;
        this.document = docRef;

        if (!isEmpty()) {
            // All the runs to loop over
            // TODO - replace this with some sort of XPath expression
            // to directly find all the CTRs, in the right order
            ArrayList<CTR> rs = new ArrayList<CTR>();
            rs.addAll(Arrays.asList(paragraph.getRArray()));

            for (CTSdtRun sdt : paragraph.getSdtArray()) {
                CTSdtContentRun run = sdt.getSdtContent();
                rs.addAll(Arrays.asList(run.getRArray()));
            }
            for (CTRunTrackChange c : paragraph.getDelArray()) {
                rs.addAll(Arrays.asList(c.getRArray()));
            }

            for (CTRunTrackChange c : paragraph.getInsArray()) {
                rs.addAll(Arrays.asList(c.getRArray()));
            }
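For comparison, the order can be kept by walking the children of w:p one by one instead of collecting each kind into its own array. The following sketch does this with the JDK DOM parser rather than the XmlBeans classes above. MarkedParagraph and markedText are hypothetical names, the [+...+] and [-...-] markers are my own convention, and the input is assumed to be a single w:p element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: paragraph text with tracked changes marked in place. */
final class MarkedParagraph {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Inserted text becomes [+...+], deleted text becomes [-...-]. */
    static String markedText(String paragraphXml) {
        Document doc;
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(paragraphXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        StringBuilder out = new StringBuilder();
        // Visit the children of w:p in document order so nothing is regrouped.
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!(n instanceof Element) || !W.equals(n.getNamespaceURI())) {
                continue;
            }
            Element e = (Element) n;
            String name = e.getLocalName();
            if ("r".equals(name)) {
                out.append(text(e, "t"));
            } else if ("ins".equals(name)) {
                out.append("[+").append(text(e, "t")).append("+]");
            } else if ("del".equals(name)) {
                // Deleted runs store their text in w:delText, not w:t.
                out.append("[-").append(text(e, "delText")).append("-]");
            }
        }
        return out.toString();
    }

    /** Concatenates the text of all descendants with the given local name. */
    static String text(Element scope, String localName) {
        StringBuilder sb = new StringBuilder();
        NodeList nodes = scope.getElementsByTagNameNS(W, localName);
        for (int i = 0; i < nodes.getLength(); i++) {
            sb.append(nodes.item(i).getTextContent());
        }
        return sb.toString();
    }
}
```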

8 (2-5) Today I found a bug in the ooxml parsing: it does not handle the following tag

            <w:smartTag w:uri="urn:schemas-microsoft-com:office:smarttags" w:element="place">
              <w:r>
                <w:rPr>
                  <w:rFonts w:ascii=" Ms  _GB2312" w:eastAsia=" Ms  _GB2312" w:hAnsi=" Tahoma  " w:cs=" Tahoma  "/>
                  <w:b/>
                  <w:bCs/>
                  <w:kern w:val="0"/>
                  <w:szCs w:val="21"/>
                </w:rPr>
                <w:t>Para</w:t>
              </w:r>
            </w:smartTag>

I do not know what the smartTag tag means. Word generates it on its own, and in Word the tagged content looks no different from the surrounding text. But ooxml does not understand this tag, so the text inside it is lost.

For now, XWPF feels quite rough. Resources are scarce: there are only two editing examples and almost no documentation. Since the document format is XML, parsing it yourself is probably not much worse than using POI. Not recommended.
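The loss comes from looking only at runs that are direct children of the paragraph. A sketch of the difference, using the JDK DOM parser instead of POI (SmartTagText, directRunText and allRunText are hypothetical names, and the input is assumed to be a single w:p element):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: compares direct-child runs with all nested runs. */
final class SmartTagText {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static Element parse(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            Document doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Text of runs that are direct children of w:p; runs inside w:smartTag are missed. */
    static String directRunText(String paragraphXml) {
        StringBuilder sb = new StringBuilder();
        for (Node n = parse(paragraphXml).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && W.equals(n.getNamespaceURI())
                    && "r".equals(n.getLocalName())) {
                sb.append(textOf((Element) n));
            }
        }
        return sb.toString();
    }

    /** Text of every w:t below w:p, in document order, whatever wrapper it sits in. */
    static String allRunText(String paragraphXml) {
        return textOf(parse(paragraphXml));
    }

    static String textOf(Element scope) {
        StringBuilder sb = new StringBuilder();
        NodeList ts = scope.getElementsByTagNameNS(W, "t");
        for (int i = 0; i < ts.getLength(); i++) {
            sb.append(ts.item(i).getTextContent());
        }
        return sb.toString();
    }
}
```

Note that allRunText also picks up text inside w:ins wrappers, so it is only a starting point if tracked changes matter.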

9 Macro tags are substituted directly into the text as plain content, with no distinguishing markers (prefix and suffix). A forced replacement may therefore change body text by mistake, and nothing can be done about it. Take the checkbox example below, which uses <w:instrText xml:space="preserve"> tags; it shows how cumbersome a Word document is.

            <w:r w:rsidRPr="00E7434E">
              <w:rPr>
                <w:rFonts w:ascii=" Tahoma  " w:hAnsi=" Tahoma  " w:cs=" Tahoma  "/>
                <w:kern w:val="0"/>
                <w:szCs w:val="21"/>
              </w:rPr>
              <w:instrText xml:space="preserve">MACROBUTTON UncheckIt</w:instrText>
            </w:r>
            <w:r w:rsidRPr="00E7434E">
              <w:rPr>
                <w:rFonts w:ascii=" Tahoma  " w:hAnsi="Wingdings" w:cs=" Tahoma  " w:hint="eastAsia"/>
                <w:kern w:val="0"/>
                <w:szCs w:val="20"/>
              </w:rPr>
              <w:sym w:font="Wingdings" w:char="F0FE"/>
            </w:r>
            <w:r w:rsidRPr="00E7434E">
              <w:rPr>
                <w:rFonts w:ascii=" Tahoma  " w:hAnsi=" Tahoma  " w:cs=" Tahoma  "/>
                <w:kern w:val="0"/>
                <w:szCs w:val="21"/>
              </w:rPr>
              <w:fldChar w:fldCharType="end"/>
            </w:r>
            <w:r w:rsidRPr="00E7434E">
              <w:rPr>
                <w:rFonts w:ascii=" Tahoma  " w:hAnsi=" Tahoma  " w:cs=" Tahoma  "/>
                <w:kern w:val="0"/>
                <w:szCs w:val="21"/>
              </w:rPr>
              <w:t xml:space="preserve"></w:t>
            </w:r>
            <w:r w:rsidRPr="00157D2C">
              <w:rPr>
                <w:rFonts w:hint="eastAsia"/>
              </w:rPr>
              <w:t> Fixed assets  </w:t>
            </w:r>
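If the XML is read directly, the missing prefix and suffix can be added by tracking the w:fldChar begin and end markers. Below is a minimal sketch with the JDK DOM parser. FieldMarkers and textWithFieldMarkers are hypothetical names, the [[instruction|content]] format is my own convention, and nested fields are not handled.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: wraps field content in [[instruction|content]] markers. */
final class FieldMarkers {
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static String textWithFieldMarkers(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            doc = f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        StringBuilder out = new StringBuilder();    // final text
        StringBuilder instr = new StringBuilder();  // field instruction
        StringBuilder shown = new StringBuilder();  // text and symbols inside the field
        boolean inField = false;
        NodeList runs = doc.getElementsByTagNameNS(W, "r");
        for (int i = 0; i < runs.getLength(); i++) {
            for (Node n = runs.item(i).getFirstChild(); n != null; n = n.getNextSibling()) {
                if (!(n instanceof Element) || !W.equals(n.getNamespaceURI())) {
                    continue;
                }
                Element e = (Element) n;
                String name = e.getLocalName();
                StringBuilder target = inField ? shown : out;
                if ("fldChar".equals(name)) {
                    String type = e.getAttributeNS(W, "fldCharType");
                    if ("begin".equals(type)) {
                        inField = true;
                        instr.setLength(0);
                        shown.setLength(0);
                    } else if ("end".equals(type) && inField) {
                        out.append("[[").append(instr.toString().trim())
                           .append('|').append(shown).append("]]");
                        inField = false;
                    }
                } else if ("instrText".equals(name)) {
                    instr.append(e.getTextContent());
                } else if ("t".equals(name)) {
                    target.append(e.getTextContent());
                } else if ("sym".equals(name)) {
                    // Symbols have no text; record font and character code instead.
                    target.append("sym(").append(e.getAttributeNS(W, "font"))
                          .append(':').append(e.getAttributeNS(W, "char")).append(')');
                }
            }
        }
        return out.toString();
    }
}
```

A replacement step can then skip anything between [[ and ]], so the checkbox is not mistaken for body text.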

//************************************************ *********

A slightly more detailed description of XWPF: **hi.baidu**/zrzx/blog/item/dde3bc31b9e248a15fdf0e36.html

After a lot of effort, I finally added fixes to POI for the two problems of not being able to locate a table's position and not being able to read object properties. Time to celebrate a little first.
Category: Java
Date: 2010-03-29
