Any body used the Structured File Import Tool with Word files lately?

Hi,


I am trying the Structured File Import Tool in combination with the Word Structured import tool but no joy so far.





Has anybody used this successfully lately?



The "Tools" and "System-Management_manual" explain how to do it.





-I can upload a word document with the Word Import Tool Converter into /tmp

-but the word file does not show up in the Structured File Import Tool



I am able to upload and imported structured HTML documents.



Any ideas how to make this work ? (and any experience in how well this works?)



thanks,



Matt

[quote]Hi,


I am trying the Structured File Import Tool in combination with the Word Structured import tool but no joy so far.





Has anybody used this successfully lately?



The "Tools" and "System-Management_manual" explain how to do it.





-I can upload a word document with the Word Import Tool Converter into /tmp

-but the word file does not show up in the Structured File Import Tool



I am able to upload and imported structured HTML documents.



Any ideas how to make this work ? (and any experience in how well this works?)



thanks,



Matt[/quote]



Hi again!



Oddly enough I was looking at it today.



There is an explanation in the source that might be useful:



Setting up the Windows Converter

--------------------------------



In order to run the Import Tools Word Converter correctly, there needs to be some sort

of Windows side process running, with Microsoft Office Installed. This is needed as

formatting Word Document conversions is not accurate in any current UNIX solution.


  1. Copying the batch files.



    In the directory that this file is located, there should be a number of .bat files and

    a filter.exe. file. These are used to control the Conversion Process. These need to be copied to a

    new directory on the windows system. To accomplish this:

    * Make a new directory anywhere on the windows system.

    * Copy over the entire contents of the windows_converter directory located at

    Matrix_Root/packages/import_tools/converters/import_tool_converter_word/windows_converter

    to the new directory made on the Windows Machine.


  2. Installing External Software



    This process uses an external conversion utility call ‘Convert Word To HTML’, available from:

    http://www.flash-utility.com/convert-word-to-html.html

    Download this piece of software, and install it onto your Windows server by running the downloaded file

    NB: Registration of this software is the responsibility of the user. Squiz is not affiliated with this

    software or its creator in any way.



    Once this software is installed, navigate to the directory where the program files were installed. In this

    directory, there will be a file called “doc2html.exe”. This file needs to be copied over to the converter base

    directory, that was setup in Step 1.


  3. Configuring the Converter.



    Now that all the files are on the windows machine, they need to be configured in order to run correctly.

    To perform this task, open up a text editor (Notepad, JEdit etc), and locate and open the file task.bat in the

    directory all the files were copied to.



    This file, task.bat, acts as a controller for the entire conversion process. It is important that this is configured

    correctly for the process to act appropriately.



    If the systems you are working on have a Samba (http://samba.org) share installed, the process will be a lot easier.



    Installing with Samba share:

    * In task.bat, locate the only uncommented line, which should be similar to:

    CALL dir_process.bat INPUT_DIR OUTPUT_DIR

    * Change the INPUT_DIR and OUTPUT_DIR directories to match those of the mapped Samba drives on the UNIX server.

    * Eg. If the Unix directory is set to g: , then the INPUT_DIR should be g:\IMPORT_WORD and g:\IMPORT_DIR,

    where IMPORT_WORD is the directory set in the Word Import Tool Converter Asset, and the IMPORT_DIR is the directory

    set in the Import Tools Manager



    Installing without a Samba Share:

    * The first step is to create 2 directories on the windows server. One for the word input files, and another for the

    completed files.

    * Now we need a way to get files over from the unix server to the windows server, and vice versa. This must be a command

    line application for the process to work seemlessly. If you can access your server using secure copy (scp), then this

    would be the preferred option. A good utility is PSCP (http://http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

    documentation at : http://www.flash-utility.com/convert-word-to-html.html.



    * Now that you have a way to transfer files to and from the UNIX server, you need to insert these commands into task.bat.

    There should be a line similar to:

    @REM **Pre-processing

    Immediately after this line, insert the command used to get the files from the UNIX server, to the input directory

    you created.

    * Next, insert a similar command to copy all completed files from the second directory you created, back over the the

    UNIX server. Insert this line directly after the line in task.bat that reads:

    @REM **Post-processing

    * The final task is to setup the local folders to process the word documents. There will be a line similar to:

    CALL dir_process.bat INPUT_DIR OUTPUT_DIR

    around the the middle of the file. Change this line so that INPUT_DIR, and OUTPUT_DIR point to the input files and

    completed directories respectively.



    Manually converting files:

    * If the uploading of documents is not time-critical, or if something is preventing you from transferring files between

    servers, you may choose to manually convert files on a regular basis by hand. To accomplish this, copy all the files

    from the word converters set directory on the UNIX server, to the input directory set on the Windows server. Run the

    file task.bat (or select run from the scheduled task interface) then copy all the files located in the output

    directory, to the import directory specified by the Import Tools Manager.


  4. Setting up a scheduled task



    In order for the converter to process files in a timely manner, it needs to be scheduled to run at a specified interval.

    The most common method of doing this is using a scheduled task. In Windows XP, scheduled tasks are located in

    Start->Programs->Accessories->System Tools->Scheduled Tasks. Create a new scheduled task, that runs the file 'task.bat',

    and set it to run as often as you want the files to be processed.


  5. If all of the above is setup and working correctly, then when you upload a file in the Matrix backend, using the Word Import



    Tool Converter asset, the file will be converted into an HTML file automatically, and placed in the directory specified in

    the Import Tools Manager Asset. Then, upon selecting the Structured File Import Tool, this new file will appear, and it can

    then be converted into Matrix assets.