Hi,
I am trying the Structured File Import Tool in combination with the Word Structured import tool but no joy so far.
Has anybody used this successfully lately?
The "Tools" and "System-Management_manual" explain how to do it.
-I can upload a Word document with the Word Import Tool Converter into /tmp
-but the Word file does not show up in the Structured File Import Tool
I am able to upload and import structured HTML documents.
Any ideas how to make this work? (And any experience of how well it works?)
thanks,
Matt
Hi again!
Oddly enough I was looking at it today.
There is an explanation in the source that might be useful:
Setting up the Windows Converter
--------------------------------
In order to run the Import Tools Word Converter correctly, a Windows-side process needs
to be running on a machine with Microsoft Office installed. This is needed because no
current UNIX solution converts Word document formatting accurately.
- Copying the batch files.
In the directory that this file is located, there should be a number of .bat files and
a filter.exe file. These are used to control the conversion process. They need to be copied to a
new directory on the Windows system. To accomplish this:
* Make a new directory anywhere on the windows system.
* Copy over the entire contents of the windows_converter directory located at
Matrix_Root/packages/import_tools/converters/import_tool_converter_word/windows_converter
to the new directory made on the Windows Machine.
- Installing External Software
This process uses an external conversion utility called ‘Convert Word To HTML’, available from:
http://www.flash-utility.com/convert-word-to-html.html
Download this piece of software, and install it onto your Windows server by running the downloaded file.
NB: Registration of this software is the responsibility of the user. Squiz is not affiliated with this
software or its creator in any way.
Once this software is installed, navigate to the directory where the program files were installed. In this
directory, there will be a file called “doc2html.exe”. This file needs to be copied over to the converter base
directory that was set up in the first step (Copying the batch files).
- Configuring the Converter.
Now that all the files are on the windows machine, they need to be configured in order to run correctly.
To perform this task, open up a text editor (Notepad, JEdit etc), and locate and open the file task.bat in the
directory all the files were copied to.
This file, task.bat, acts as a controller for the entire conversion process. It must be configured
correctly for the process to work.
If the systems you are working on have a Samba (http://samba.org) share installed, the process will be a lot easier.
Installing with Samba share:
* In task.bat, locate the only uncommented line, which should be similar to:
CALL dir_process.bat INPUT_DIR OUTPUT_DIR
* Change the INPUT_DIR and OUTPUT_DIR directories to match those of the mapped Samba drives on the UNIX server.
* E.g. if the UNIX directory is mapped to g: , then INPUT_DIR should be g:\IMPORT_WORD and OUTPUT_DIR should be g:\IMPORT_DIR,
where IMPORT_WORD is the directory set in the Word Import Tool Converter Asset, and IMPORT_DIR is the directory
set in the Import Tools Manager
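As a rough illustration of the Samba case, the single active line in task.bat might end up looking like the sketch below. The drive letter G: and the directory names IMPORT_WORD and IMPORT_DIR are placeholders taken from the example above, not real paths; substitute whatever is configured in your own Word Import Tool Converter asset and Import Tools Manager.

```batch
@REM Sketch only: assumes the UNIX share is mapped to G: on the Windows machine.
@REM IMPORT_WORD = directory set in the Word Import Tool Converter asset (placeholder name)
@REM IMPORT_DIR  = directory set in the Import Tools Manager (placeholder name)
CALL dir_process.bat G:\IMPORT_WORD G:\IMPORT_DIR
```

With this layout no copy commands are needed, because the converter reads from and writes to the UNIX directories directly through the mapped drive.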
Installing without a Samba Share:
* The first step is to create 2 directories on the windows server. One for the word input files, and another for the
completed files.
* Now we need a way to get files over from the UNIX server to the Windows server, and vice versa. This must be a command
line application for the process to work seamlessly. If you can access your server using secure copy (scp), then this
would be the preferred option. A good utility is PSCP (http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html,
documentation at: http://www.flash-utility.com/convert-word-to-html.html).
* Now that you have a way to transfer files to and from the UNIX server, you need to insert these commands into task.bat.
There should be a line similar to:
@REM **Pre-processing
Immediately after this line, insert the command used to get the files from the UNIX server, to the input directory
you created.
* Next, insert a similar command to copy all completed files from the second directory you created back over to the
UNIX server. Insert this line directly after the line in task.bat that reads:
@REM **Post-processing
* The final task is to set up the local folders to process the Word documents. There will be a line similar to:
CALL dir_process.bat INPUT_DIR OUTPUT_DIR
around the middle of the file. Change this line so that INPUT_DIR and OUTPUT_DIR point to the input files and
completed files directories respectively.
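The non-Samba steps above could be sketched in task.bat roughly as follows. This is not the shipped file: the user name, host name, key file, and every path here are made-up placeholders, and the exact position of the marker lines in your copy of task.bat may differ. It assumes PSCP is on the PATH and that key-based login to the UNIX server already works, so that the scheduled run never stops at a password prompt.

```batch
@REM **Pre-processing
@REM Fetch uploaded Word files from the UNIX server into the local input directory.
@REM matrix@unix.example.com, the key file and all paths are hypothetical placeholders.
pscp -batch -i C:\converter\key.ppk matrix@unix.example.com:/tmp/import_word/* C:\converter\input\

@REM Convert everything in the input directory into the completed-files directory.
CALL dir_process.bat C:\converter\input C:\converter\output

@REM **Post-processing
@REM Push the converted files back to the directory set in the Import Tools Manager.
pscp -batch -i C:\converter\key.ppk C:\converter\output\* matrix@unix.example.com:/tmp/import_dir/
```

Depending on the server, PSCP may need extra options before it accepts the wildcard on the remote side, so it is worth running each command once by hand before relying on the scheduled run.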
Manually converting files:
* If the uploading of documents is not time-critical, or if something is preventing you from transferring files between
servers, you may choose to convert files by hand on a regular basis. To accomplish this, copy all the files
from the directory set in the Word Import Tool Converter on the UNIX server to the input directory set on the Windows
server. Run the file task.bat (or select run from the scheduled task interface), then copy all the files located in
the output directory to the import directory specified by the Import Tools Manager.
- Setting up a scheduled task
In order for the converter to process files in a timely manner, it needs to be scheduled to run at a specified interval.
The most common method of doing this is using a scheduled task. In Windows XP, scheduled tasks are located in
Start->Programs->Accessories->System Tools->Scheduled Tasks. Create a new scheduled task that runs the file 'task.bat',
and set it to run as often as you want the files to be processed.
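If you prefer the command line to the Scheduled Tasks window, the same thing can probably be done with the built-in schtasks command. The task name, the 15-minute interval, and the path below are only examples and are not taken from the original instructions.

```batch
@REM Sketch: create a task that runs task.bat every 15 minutes.
@REM "MatrixWordConverter" and C:\converter\task.bat are placeholder values.
schtasks /Create /SC MINUTE /MO 15 /TN "MatrixWordConverter" /TR "C:\converter\task.bat"
```

Pick an interval that matches how quickly uploaded Word files need to show up in the Structured File Import Tool.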
- If all of the above is set up and working correctly, then when you upload a file in the Matrix backend using the Word Import
Tool Converter asset, the file will be converted into an HTML file automatically and placed in the directory specified in
the Import Tools Manager Asset. Then, upon selecting the Structured File Import Tool, this new file will appear, and it can
then be converted into Matrix assets.