Creating a Word .docx from Outsystems

Hi all,

I've read some topics on this issue, but I'm having a problem and would like to know if any of you has had it or knows how to correct it.

The way I'm creating the docx is to keep the original one somewhere and then use the Zip extension to access the XML files inside it, change "word/document.xml", and put everything into a new zip, which I then download under a name like "test.docx".
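
For readers outside the platform, the approach described above can be sketched in a few lines of Python. This is only an illustration using the standard `zipfile` module, not the eSpace logic or the Zip extension itself; the function name `replace_in_docx` is hypothetical, and it assumes that "word/document.xml" is UTF-8 and that the text to replace is not split across several XML elements.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def replace_in_docx(docx_bytes: bytes, old: str, new: str) -> bytes:
    """Return a copy of the .docx with `old` replaced by `new` in word/document.xml."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src:
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                data = src.read(item.filename)
                if item.filename == "word/document.xml":
                    text = data.decode("utf-8")
                    # Escape &, < and > so the replacement keeps the XML well-formed.
                    text = text.replace(escape(old), escape(new))
                    data = text.encode("utf-8")
                # Reuse the original entry name so every part keeps its path.
                dst.writestr(item.filename, data)
    return out.getvalue()
```

Every entry is copied under its original name, so the parts stay at the same paths they had in the source archive.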

When I open it, Word just gives an error saying "the content may have been corrupted, do you want Word to try to correct it?" (this is a free translation from the Portuguese error, so it may not be that accurate :) ). If I choose yes, it opens the modified document with all the changes I've made, although it loses the name of the file.

Is there any way to correct this?

Thanks!
Hello Hermínio

The key is to identify where the corruption is located within the docx file. When you get the error, do you also get a description, or additional information regarding where the corruption occurs? Something like:

"word/document.xml", line 50, column 10

If you have that, then you only need to change the file's extension to .zip, decompress it, navigate to the .xml file within the directory tree, and go to the line and column specified in the error. If you find an invalid character, you can track it back to the content you're changing in the eSpace and correct it.

As an example, I've manually corrected docx files using this technique when invalid characters were injected at runtime, characters like 0xA0 (the non-breaking space; not to be confused with \n, which is 0x0A).

If you don't have this type of information, then tracking the corruption might be a little harder, but by systematically removing the injected content, you can narrow it down to some invalid character, element, or structure in that XML.
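
When Word does not report a position, an XML parser can often supply one. Below is a minimal sketch in Python, assuming the extracted part is UTF-8 encoded (as Word normally writes it); `find_xml_problem` is a hypothetical helper name, and the check only covers encoding and well-formedness, not whether the markup is valid for Word.

```python
import xml.etree.ElementTree as ET


def find_xml_problem(xml_bytes: bytes):
    """Return (line, column, message) for the first problem found, or None.

    Lines are 1-based; columns are 0-based offsets within the line.
    """
    # First look for bytes that are not valid UTF-8, such as a raw 0xA0.
    try:
        xml_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        line = xml_bytes.count(b"\n", 0, exc.start) + 1
        line_start = xml_bytes.rfind(b"\n", 0, exc.start) + 1
        message = f"invalid UTF-8 byte 0x{xml_bytes[exc.start]:02X}"
        return line, exc.start - line_start, message

    # Then let the parser report the first well-formedness error.
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        line, column = exc.position
        return line, column, str(exc)
    return None
```

Calling it on the bytes of the extracted "word/document.xml" gives either `None` or a position to inspect in a text editor.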

Hope this information is helpful.

Let us know your findings.

Cheers

Miguel Simões João
Hi,

I wish the problem was that simple... the error in Office Word 2007 is as generic as it gets, even in the details. The altered XML I create has no visible problem, and even if I try to just replicate the original document, without alteration, it still gets corrupted.

I'm submitting my .oml so you can see for yourself; it uses the BinaryData and Zip extensions. You simply choose a document and it changes every occurrence of the word "Example" to "Working".

Thanks,
Hermínio Mira

My guess is that the docx itself contains some info like a CRC to check if the document is not altered.

Hi Hermínio

I've replicated the same behavior with the OML you've provided. Even without changing the XML content, the file got corrupted.

So I decompressed the generated file and compressed it again, using both the Windows compression tool and 7-Zip. The file no longer got corrupted.

So I'm assuming that this problem could be related to the compression library used. In this case, you're using the Zip extension, which in turn uses an old version of the SharpZipLib library. Searching the web, I could find only one report of identical behavior, at http://www.generation-nt.com/us/answer/why-does-word-not-like-docx-file-help-36297322.html, which again involved the SharpZipLib library, and only with MS Office 2007 SP2. I can also replicate it with SP2. Do you have Office 2007 SP2 as well?

If you try to update the SharpZipLib assembly in the extension, you'll find that some properties no longer exist. This is a clear indication that the version currently in use might be very old indeed.

I suggest you try updating the library yourself, to check if this problem persists. I'll try to get an update on this component as well.

Hope this information helps.

Cheers

Miguel Simões João

Hi Miguel,

In the meantime I came to the same conclusions as you, and I'm already experimenting with some other .NET libraries to see if they work. I think I also found the reason for this behavior, or rather why it is happening...
When you unzip the docx you get a folder with all the files. If you then zip that same folder, the file gets corrupted and generates this error; if you enter the folder and zip just its contents, the file works perfectly!
So it seems the library used by the Zip extension, for some reason, creates this folder in memory and then zips it, which probably works fine for other zip files, just not for docx ones.
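
The difference between zipping the folder and zipping its contents can be checked and undone programmatically. The following is a rough Python sketch of that idea, not what the Zip extension does; `strip_wrapping_folder` is a hypothetical name, and it assumes the only problem is a single wrapping folder in front of every part.

```python
import io
import zipfile

CONTENT_TYPES = "[Content_Types].xml"


def strip_wrapping_folder(docx_bytes: bytes) -> bytes:
    """Repack the archive so that [Content_Types].xml sits at the archive root."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src:
        # Ignore explicit directory entries; only real parts matter.
        names = [n for n in src.namelist() if not n.endswith("/")]
        if CONTENT_TYPES in names:
            return docx_bytes  # parts are already at the root

        # Find the folder prefix in front of [Content_Types].xml, e.g. "test/".
        prefixes = [n[: -len(CONTENT_TYPES)] for n in names
                    if n.endswith("/" + CONTENT_TYPES)]
        if not prefixes:
            raise ValueError("no [Content_Types].xml found in the archive")
        prefix = min(prefixes, key=len)

        out = io.BytesIO()
        with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
            for name in names:
                if name.startswith(prefix):
                    # Drop the wrapping folder from the entry name.
                    dst.writestr(name[len(prefix):], src.read(name))
    return out.getvalue()
```

Listing the entry names of a failing file with `zipfile.ZipFile(...).namelist()` is a quick way to confirm whether a wrapping folder is present before repacking.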

If we get this working, I was thinking of using it to create a new component with the same functionality as WordMerge (input a docx and an Excel file with the list of original values and their replacements, and get back the modified docx). This one would have the big advantage of not even requiring Microsoft Office to be installed!

cheers,
Hermínio Mira

I have solved the problem and just decided to share with the community.

I'm uploading an example .oml and the necessary extension.

P.S: To make it work you'll also need the "hashtable" extension

Here is the extension mentioned in the previous post...

Hi Hermínio!

This is great stuff.

I think this would be a great thing to be shared in the Components section, so that more people will be able to access it.

Thanks for sharing it.

Best regards,

Paulo
Hi Paulo,

Happy to know it was useful!
I was already thinking of publishing the component, but was waiting for someone else to test it first. :)

Here is the link:
http://www.outsystems.com/NetworkSolutions/ProjectDetail.aspx?ProjectId=142

P.S: Changed the name to reflect more the reality of the extension.

Cheers,
Hermínio
Hi Paulo,

I tried to upload the extension, but Integration Studio gives the following error message:

You are trying to Upload or Publish an Extension whose Intellectual Property is Protected, since it was created in a different Agile Platform Infrastructure than ''localhost''. To obtain the Intellectual Property Rights for using this Extension in ''localhost'' Infrastructure, please go to OutSystems' Intellectual Property Services at 'http://www.outsystems.com/ipp'.

I'm using Community Edition. Is this only available for higher editions?

Regards,

Miguel Pereira
Hi Miguel,

It just means you need to go to 'http://www.outsystems.com/ipp' and submit the extension. You'll then receive the extension again by email, and you can publish it.

On this link there is an explanation of the reason behind this procedure.

Cheers,
Hermínio Mira
Hi Hermínio,

I submitted the extension 2 days ago but I haven't received the new extension yet. I will wait.

Best regards.

Miguel Pereira
Miguel,

Check your spam box, and if there is no email, contact OutSystems support at support@outsystems.com

RNA.
Hi,

Thanks again. I published the DocxMerge extension; however, when testing the substitution of a word in a simple file, teste.docx, the action "WordReplaceTemplateKeys" exits with an error indicating that it cannot find the library WindowsBase.dll:

Could not load file or assembly 'file:///...................\admin\Bin2\WindowsBase.dll' or one of its dependencies. The system cannot find the file specified.

Is it necessary to add anything else? In Hermínio's example that I saw, he didn't use anything else.

Thanks,

Miguel
Hi,

I've been investigating the problem, and it seems that the library WindowsBase.dll comes with .NET Framework 3.5. Is that true?

Best Regards,

Miguel Pereira
Hi Miguel,

I'm pretty sure 'windowsbase.dll' is .Net framework 3.0 onwards. I might include it as a pre-requisite in the component... didn't think of it at the time!
I hope you get to use the component.

Cheers,
Hermínio Mira
Hi Hermínio,

After installing .Net framework 3.0, the component worked perfectly! 

But I have one last question (I hope): I'm testing changes to a Word (docx) document template with the component, but the document has check boxes. Is there some dynamic way, using the component, to select Word check boxes?

Best Regards,

Miguel Pereira

Hi Miguel,

Glad it works now! But I'm sorry, I can't help you with the check boxes...
If you run into any problems with the component, let me know, because I have already found a "feature" in the way Word creates the document XML that prevents the component from working correctly, but it can be bypassed.

Cheers,
Hermínio Mira

Hi all.

I'm using your extension. I have a docx template and I want to replace one phrase.
After calling the action "WordReplaceTemplateKeys", it returns the error "The specified package is invalid. The main part is missing".
You can see what I'm doing in the attached picture.

Can you help, please?

Thank you.
hi all,

I have the same problem as Cristina Barreira :S

Can you help, please?

Big thank you.
Hi Cristina and Sérgio,

Sorry for the late reply.

Apparently that error is caused by the document not being in the expected format. Read this post in Microsoft's forums.

Maybe you could attach a sample of the document you are trying to open, right before it is opened by the action you're calling?
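
One plausible way to inspect such a file before handing it to the action is sketched below in Python. It is only a sanity check under stated assumptions: `describe_docx_problem` is a hypothetical name, and the list of required parts assumes the usual Word layout, in which the main document part is "word/document.xml" (strictly speaking, its location is declared in "_rels/.rels"). It does not reproduce the check the extension performs internally.

```python
import io
import zipfile

# Parts a typical Word-generated .docx contains (an assumption, see above).
REQUIRED_PARTS = ("[Content_Types].xml", "_rels/.rels", "word/document.xml")


def describe_docx_problem(data: bytes):
    """Return a short description of what looks wrong, or None if nothing does."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "not a ZIP archive (for example a legacy .doc or an empty upload)"
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = set(archive.namelist())
    missing = [part for part in REQUIRED_PARTS if part not in names]
    if missing:
        return "missing parts: " + ", ".join(missing)
    return None
```

Running this on the binary content right before it reaches the action shows whether the input is a real .docx package at that point.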

Regards,

Paulo Tavares