Copyright (C) 1998,1999,2000 Data Exchange Associates, Inc.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Contact information:
Data Exchange Associates, Inc.
230 Burnt Meadow Road
Groton, MA 01450-1539
http://www.dexa.com
support@dexa.com
FormFiler is normally free of charge, and is distributed under the GNU General Public License. This means that individuals and organizations, both commercial and non-commercial, may use FormFiler without charge. If you modify and redistribute it, there are certain obligations you must fulfill. See the file COPYING, which came with your FormFiler distribution, for the full license.
FormFiler is not guaranteed to be supported other than by
making full source code available. If it breaks you get to keep both
pieces. However, Data Exchange Associates, Inc. (DEX) sometimes
answers questions. The more concise and better researched your
question, the more likely it is to get an answer. No tutorials will be
provided. You have to learn this stuff on your own.
If you desire expert help installing or configuring FormFiler, or
enhancements to it, DEX can provide support at our prevailing hourly
consulting rate.
The FormFiler script allows the web designer to gather input from an HTML form and have that input collected in a file. Many applications, such as Microsoft Excel, various databases, and mailing-list software, can import delimited ASCII text files. FormFiler is ideal for situations where user input must be collected and then processed later.
FormFiler is known to work properly in the following environments:
Operating System: UNIX, Linux
CGI: perl version 5.004 or better
Webservers: Apache 1.3
Note: FormFiler should also work on webservers other than Apache, and on Microsoft Windows NT, but this has not been tested.
Version 3.1 is now GPL'ed.
Version 3.1 adds the capability to do additional processing on data received from a FORM. You may give FormFiler a list of additional programs, shell or perl scripts, etc. with which to process a FORM's data. The only requirement is that such programs be capable of reading their input from standard input (stdin).
FormFiler.cgi reads the browser's input stream containing all the data from an input FORM and appends the form's data to a capture file specified by the web designer. FormFiler can then either return a simple success message to the user's browser, or forward the user to another URL. FormFiler gathers data in two different ways. If you include an "extractlist" in your FORM, those fields, and only those, are extracted to the capture file as a delimited record; field names are NOT included. If, however, you do not include an "extractlist", FormFiler will extract ALL fields to the capture file, and each field will be identified with its name as well as its value as an "=" delimited name/value pair, e.g. fieldName=ThisMyValue.
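The two record layouts described above can be illustrated with a short Python sketch. This is not FormFiler's Perl code; the function name format_record, the treatment of a missing field as an empty string, and the dict-order output in "all" mode are assumptions made for illustration only.

```python
def format_record(fields, extractlist=None, fs=",", rs="\n"):
    """Return one capture-file record for a dict of submitted form fields."""
    if extractlist:
        # "specified" layout: values only, in the order given by extractlist.
        names = [name.strip() for name in extractlist.split(",")]
        # Assumption: a field missing from the input becomes an empty value.
        parts = [fields.get(name, "") for name in names]
    else:
        # "all" layout: name=value pairs; the order is whatever the dict holds.
        parts = [f"{name}={value}" for name, value in fields.items()]
    return fs.join(parts) + rs


form = {
    "Name": "John E. Smith",
    "Email": "jes@some.system.com",
    "Tel": "555-555-5555",
}
print(format_record(form, extractlist="Name, Email, Tel"), end="")
print(format_record(form), end="")
```

The first call prints a values-only record and the second prints name=value pairs, both separated by the field separator.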
In order to install FormFiler you need "telnet" and FTP access to the host website. If your ISP does not provide telnet access you may unpack the archive on your local system, perform the changes shown below there, and then ftp the results to your website. For the purposes of this discussion we will describe the "telnet" method. Begin the installation by using FTP to upload the FormFiler archive, e.g. FormFiler.tar, to your website. Once this step is completed, initiate a telnet session and log on to your website account.
1. Move the FormFiler archive to a temporary directory and change to that directory. Example:
$ mv FormFiler.tar cgi-temp
$ cd cgi-temp
2. Unpack the archive. Example:
$ tar -xvf FormFiler.tar
3. Set the file access permissions on the scripts FormFiler.cgi and FormHarvest.cgi to "executable". Example:
$ chmod a+x FormFiler.cgi FormHarvest.cgi
4. If your perl interpreter is NOT "/usr/bin/perl" you must edit the scripts to reflect the location of your perl interpreter. You can find the perl interpreter with the "which" command. Example:
$ which perl
/bin/perl
Above, the system responds that perl is located in "/bin/perl". If perl is NOT located in "/usr/bin/perl", use a programmer's text editor to change the first line of both FormFiler.cgi and FormHarvest.cgi so that it names the path reported by the "which" command. For example, if your perl is in "/bin/perl", change the first line of each script from "#!/usr/bin/perl" to "#!/bin/perl".
5. You may configure FormFiler by changing variables in the FormFiler.pm file. In most cases it is unlikely that these values need to be changed.
$FS - Field Separator. The delimiter character used by FormFiler to separate fields in the output file. The default is the ASCII comma character. NOTE: You should always choose a field delimiter character that will not appear in any input.
$RS - Record Separator. The delimiter character used by FormFiler to separate records within the output file. The default is newline.
$Message - The default success message. This contains a friendly message sent back to the user upon completion by FormFiler. If you use the hidden INPUT field "jumpto", FormFiler will instead direct the user's browser to the URL specified in the "jumpto" value.
6. Next you must configure the FormHarvest.pm file. This file contains values used by FormHarvest.cgi. The following variables must be set. The default values are probably OK, but you should check with your ISP to be sure.
$mailer - This variable is the full path to the mail transport agent (MTA) used on your system. The default is '/usr/sbin/sendmail'; this may be different on other systems. Check with your system administrator or ISP.
$mailerOpts - A string containing the options passed to the MTA defined in $mailer. The default is '-t', which is appropriate for "sendmail". The purpose of the '-t' switch is to force "sendmail" to parse the input for various mail header fields; this is absolutely necessary for the proper operation of FormHarvest.cgi.
$admin - This string is the email address of the person responsible for maintaining this site. The default is the webmaster for this site. You may prefer a different recipient.
7. Since only one client process may update the target data file at a time FormFiler provides a simple file locking mechanism. Two variables in both FormFiler.pm and FormHarvest.pm control access to the target data file. You can configure FormFiler.pm and FormHarvest.pm separately. You should leave these unchanged until it is proven that you are experiencing an excessive number of lock refusal errors from either FormFiler.cgi or FormHarvest.cgi.
$wait - The number of seconds a client process will wait between attempts to acquire a lock on the target data file. The default is one (1) second. If you are experiencing lock refusals you should probably increase the number of "retries".
$retries - The maximum number of attempts a client process will make to acquire a lock on the target data file before giving up. The default value is five (5).
@tasks - This is a list of all the additional processing commands you want to run against the FORM data after the form has been processed. Each entry is separated by a comma, and may be enclosed in quotes. Each command may also have command line arguments. By default this option is undefined. Example:
@tasks = ( '/home/myhome/bin/myfilter arg1 arg2', '/usr/bin/cat' );
8. Once you are satisfied you may move the FormFiler package files to the directory where your server will find and execute them. For example, to put FormFiler in your cgi-bin directory:
$ mv FormFiler.cgi FormFiler.pl FormFiler.pm FormHarvest.cgi FormHarvest.pm cgi-lib.pl ~/cgi-bin
9. Move the sample file and success page to your web document directory. Example:
$ mv SampleForm.html successpage.html ~/public_html
This moves the pages to the directory /home/USERID/public_html. This is the most likely location for your webpages.
10. Move the README and documentation files to the directory where you keep your document files. For example, you have a directory called doc specifically for program documentation:
$ mv README FormFiler.html ~/doc
This moves the documentation files to /home/USERID/doc. You can then access this document with your web browser.
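The interplay of $wait and $retries described in step 7 can be sketched in Python. FormFiler's actual locking code is not shown in this document, so the lock-file approach and the names acquire_lock and release_lock below are assumptions; only the wait/retry behaviour and the defaults (1 second, 5 attempts) come from the text.

```python
import os
import time


def acquire_lock(lock_path, wait=1, retries=5):
    """Try up to `retries` times to create lock_path; True on success."""
    for attempt in range(retries):
        try:
            # O_EXCL makes creation fail if the lock file already exists,
            # so only one process can hold the lock at a time.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            # Someone else holds the lock: pause before the next attempt.
            if attempt < retries - 1:
                time.sleep(wait)
    return False  # corresponds to a lock refusal


def release_lock(lock_path):
    """Remove the lock file so other processes can proceed."""
    os.remove(lock_path)
```

With the defaults, a process gives up after roughly four seconds of waiting, which is why raising the retry count is the suggested first remedy for lock refusals.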
1. Creating an HTML FORM for use with FormFiler is extremely simple. Create an HTML FORM document as you normally would. Give each INPUT or TEXTAREA element a name property. The name for a field should be meaningful; for example, "firstName" would be a good choice for the INPUT box that holds a client's first name.
2. Once the FORM looks satisfactory add the ACTION property to the FORM's start tag. This property contains the URL the webserver uses to determine which script to invoke. Normally this will be the /cgi-bin path on your website. As an example the FORM tag might look like this:
<FORM NAME="MailForm" ACTION="/cgi-bin/FormFiler.cgi?../formdata/mailform.db" METHOD="POST" ENCTYPE="application/x-www-form-urlencoded">
Here the ACTION property tells the webserver to invoke the URL /cgi-bin/FormFiler.cgi. The remainder of the URL, ?../formdata/mailform.db, is an argument passed directly to FormFiler; it tells FormFiler the location and filename to be used to save the FORM data. You may select any location and file provided they are "write" accessible to FormFiler.
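As a small illustration of how the ACTION value above divides into a script URL and a data file argument, here is a Python sketch using the standard urllib.parse module. Under the CGI convention, the part after the "?" reaches the script in the QUERY_STRING environment variable; how FormFiler itself reads it is not described here.

```python
from urllib.parse import urlsplit

action = "/cgi-bin/FormFiler.cgi?../formdata/mailform.db"
parts = urlsplit(action)

script_url = parts.path   # the URL the webserver maps to the CGI script
data_file = parts.query   # the argument naming the capture file

print(script_url)
print(data_file)
```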
Note: FormFiler.cgi always adds new data to the end of the specified
data file. It is therefore necessary to "harvest" the data file regularly
and start a new empty data file. This can be done by simply giving
the data file a new name, or copying it to a different file and deleting
the original data file. FormFiler will simply start a new file.
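The manual harvest described in the note amounts to a rename. A minimal Python sketch follows; the function name rotate_data_file and the date-suffix naming scheme are assumptions chosen for illustration, not something FormFiler prescribes.

```python
import datetime
import os


def rotate_data_file(path, today=None):
    """Rename the data file so FormFiler starts a fresh one; return new name."""
    today = today or datetime.date.today()
    # Assumed naming scheme: original name plus a YYYYMMDD suffix.
    new_path = f"{path}.{today:%Y%m%d}"
    os.rename(path, new_path)
    return new_path
```

Because the original name no longer exists after the rename, the next FORM submission creates a new, empty data file.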
FormFiler has two methods of selecting and storing FORM data, "all" and "specified".
The "all" method is the simplest. Simply do nothing, and FormFiler will extract all the fields and their values from the FORM. The data will then be written to the supplied output data file. However, browsers make no guarantee as to the ordering of the fields, and indeed the same browser on a different platform may do things in a different order. Further, the CGI script interpreter perl makes no promise concerning the ordering of data within its hash structure. So that intelligible data is written to the data file when using the "all" method, FormFiler writes each field as a name/value pair separated by the ASCII equal sign (=). Each field is then delimited with the $FS delimiter. As an example, if you've used the "all" method your data file might look something like this:
fieldOne=value of field one,fieldTwo=value of field two,fieldThree=value of field three
The other method is the "specified" method. With this method you embed a "hidden" field within the FORM named "extractlist". The "extractlist" specifies a list of the fields that you want extracted. The ordering of the "extractlist" is the order in which the fields will be written to the data file. In this case FormFiler does NOT write the field names to the data file; only the data values are written. It is assumed that you know the names and order from the "extractlist" hidden field. For example, you have a FORM with several INPUT fields, but you are only interested in three: "Name", "Email", and "Tel". Simply embed the hidden field "extractlist" into your FORM as shown below:
<INPUT TYPE="HIDDEN" NAME="extractlist" VALUE="Name, Email, Tel">
Now when FormFiler executes it will find the "extractlist" and place the values of only those fields into the output data file. In this case the data record might look something like this:
John E. Smith,jes@some.system.com,555-555-5555
Note: to see a fully worked example of this, examine the sample FORM "SampleForm.html" supplied with this package.
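For later processing, both kinds of data file can be read back with a few lines of Python. This is only a sketch: the function names read_specified and read_all are hypothetical, the default separators (comma and newline) are taken from the configuration section, and the code assumes the separator never appears inside a value.

```python
def read_specified(text, extractlist, fs=","):
    """Parse values-only records; field names come from the extractlist."""
    names = [name.strip() for name in extractlist.split(",")]
    records = []
    for line in text.splitlines():
        if line:
            records.append(dict(zip(names, line.split(fs))))
    return records


def read_all(text, fs=","):
    """Parse records written as name=value pairs."""
    records = []
    for line in text.splitlines():
        if not line:
            continue
        record = {}
        for item in line.split(fs):
            # Split at the first "=" only, so values may contain "=".
            name, _, value = item.partition("=")
            record[name] = value
        records.append(record)
    return records


sample = "John E. Smith,jes@some.system.com,555-555-5555\n"
print(read_specified(sample, "Name, Email, Tel"))
```

If a value can contain the separator character, this simple split misreads the record, which is the reason for choosing a field delimiter that never appears in the input.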
Required Fields
Many times you will find it necessary to be sure that the client has
entered a specific subset of fields. FormFiler provides a means to
list the "required" fields within a FORM. FormFiler will enforce this
requirement and return to the client a list of the "required" fields
that were not found in the client's FORM input; no further processing
will take place. The "required" list is similar to the "extractlist":
it names all the fields within the current FORM that must be supplied
by the client. As an example, you have a FORM that collects the
client's Name, Street, City, State, ZipCode, Phone, Email, and Hobbies.
At a minimum you wish to be sure that you get the client's Name,
Street, City, State, ZipCode and Email. You can instruct FormFiler to
check for those fields before proceeding. This is accomplished by
including a "hidden" INPUT in your FORM with the name "required".
Here's an example:
<INPUT NAME="required" TYPE="hidden" VALUE="Name, Street, City, State, ZipCode,Email">
The above will cause FormFiler to scan the client's input for the fields listed in the VALUE of the "required" INPUT field.
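The required-field check can be pictured with the following Python sketch. It is an illustration rather than FormFiler's own logic: the function name missing_required is hypothetical, and treating a blank or whitespace-only value as "not found" is an assumption.

```python
def missing_required(fields, required):
    """Return the required field names that are absent or blank in fields."""
    names = [name.strip() for name in required.split(",") if name.strip()]
    # Assumption: a field that is present but empty counts as missing.
    return [name for name in names if not fields.get(name, "").strip()]


submitted = {"Name": "John E. Smith", "City": "Groton", "Email": ""}
print(missing_required(submitted, "Name, Street, City, State, ZipCode,Email"))
```

An empty result means every required field was supplied and processing can continue.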
FormFiler allows you either to send a simple response message to the client on success, or to direct the client to another URL entirely. By default FormFiler responds with a simple message. This message is defined in the configuration file "FormFiler.pm":
$Message = 'Success: Your input has been received. Thank You.';
You may use a programmer's editor to change this message.
Alternatively, you can instruct FormFiler to direct the client's browser
to another URL. This is accomplished by embedding a hidden field
named "jumpto" into your FORM. The VALUE of the "jumpto" field contains
the URL you want the client's browser directed to. For example, you have
a fancier page than our simple $Message text to reward the client's eyes
and you'd like the client's browser to show that page. Embed a hidden
field something like:
<INPUT TYPE="HIDDEN" NAME="jumpto" VALUE="http://www.mysite.com/eyeCandy.html">
Note: you can examine the sample page "SampleForm.html" included with this package.
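The choice between the two responses can be sketched in Python as follows. This document does not say how FormFiler performs the redirect; emitting a CGI "Location:" header is one common way and is an assumption here, as is the function name build_response.

```python
DEFAULT_MESSAGE = "Success: Your input has been received. Thank You."


def build_response(fields, message=DEFAULT_MESSAGE):
    """Return the CGI response text: a redirect if jumpto is set, else a message."""
    jumpto = fields.get("jumpto")
    if jumpto:
        # Assumption: redirect by way of a CGI Location header.
        return f"Location: {jumpto}\r\n\r\n"
    # Otherwise send the plain success message.
    return f"Content-Type: text/plain\r\n\r\n{message}\n"


print(build_response({"jumpto": "http://www.mysite.com/eyeCandy.html"}), end="")
```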
Included in the FormFiler package is the CGI script FormHarvest.cgi. FormHarvest.cgi allows you to create a webpage to collect (harvest) the data gathered by FormFiler. FormHarvest.cgi packages the gathered data and emails it to a designated address. Once FormHarvest.cgi has emailed the contents of the data file, the original file is renamed. The new name will have today's date appended to it. This will cause FormFiler to start a new data file. You must configure FormHarvest.pm before proceeding. You should also restrict access to the webpage that harvests your data (see the Password Protection section).
Use your favorite editor or HTML page composer for this operation. Create a simple HTML FORM to allow your data file to be sent to its destination (a simple sample file "getdata.html" is included as an example).
FormHarvest.cgi recognizes and uses the following fields within a FORM: to, harvest, from, and subject. The fields "to" and "harvest" are required; the "from" and "subject" fields are optional. Each field and its purpose is discussed below.
The "to" Field (required)
The "to" field is used to specify the email address of the recipient
for the harvested data. The "to" field is REQUIRED.
Example:
<INPUT NAME="to" TYPE="HIDDEN" VALUE="somebody@some.place.com">
Security Note: it is NOT recommended that this field be user accessible, as this would potentially allow your data to be mailed to any address on the Internet.
The "harvest" Field (required)
The "harvest" field specifies the path
and name of the file to be mailed to the recipient. The "harvest"
field is REQUIRED.
Example:
<INPUT NAME="harvest" TYPE="HIDDEN" VALUE="/home/USERID/data/mydata.txt">
Security Note: it is NOT recommended that this field be user accessible, as this would potentially allow any file that FormHarvest.cgi can read to be mailed out.
The "from" Field (optional)
The "from" field allows you to specify
the email address from which the harvested data will appear to come.
The default for this field is the address of the website administrator as
defined in the $admin variable of the FormHarvest.pm
file. Example:
<INPUT NAME="from" TYPE="HIDDEN" VALUE="ralph@my.site.com">
The "subject" Field (optional)
The "subject" field allows you to specify the "Subject:" line for the email message sent by FormHarvest.cgi. If this field is not defined in your FORM, FormHarvest.cgi will create a "Subject:" line composed of the data file name and today's date. Example:
<INPUT NAME="subject" TYPE="HIDDEN" VALUE="Here is the data!">
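To show how the four fields map onto a mail message suitable for "sendmail -t", here is a Python sketch using the standard email and subprocess modules. It is not FormHarvest.cgi's code. The function names, the placeholder admin address, and the exact layout of the default subject (file name, a space, then an ISO date) are assumptions.

```python
import datetime
import subprocess
from email.message import EmailMessage


def build_harvest_mail(fields, data, admin="webmaster@localhost", today=None):
    """Build the mail that carries the harvested data file contents."""
    today = today or datetime.date.today()
    msg = EmailMessage()
    msg["To"] = fields["to"]                      # required field
    msg["From"] = fields.get("from") or admin     # falls back to $admin
    # Assumed default subject: data file name followed by today's date.
    default_subject = f"{fields['harvest']} {today.isoformat()}"
    msg["Subject"] = fields.get("subject") or default_subject
    msg.set_content(data)
    return msg


def send(msg, mailer="/usr/sbin/sendmail", mailer_opts=("-t",)):
    """Pipe the message to the MTA; -t makes sendmail read the headers."""
    subprocess.run([mailer, *mailer_opts], input=msg.as_bytes(), check=True)
```

Because the recipient is taken from the "To:" header, the '-t' option is what lets the MTA find the destination without a command line address.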
Additional Processing of FORM Data
FormFiler provides a mechanism that allows you to do additional processing
on input received from a FORM.
After FormFiler has completed the initial processing of your FORM,
the @tasks list is examined; if the list is defined,
the commands in it are executed.
Cosmically speaking, you can use any executable program, shell script, perl script, etc. to process your FORM's input. There are, however, some rules that such executables must follow.
Rules for executables:
1. The command must read its input from "standard input" (stdin).
2. The input is supplied in url-encoded form just as it would be received
from a web browser.
3. Any output written to "standard output" (stdout) is sent to the
web browser.
4. Any output written to "standard error" (stderr) is written to the
webservers error log.
If the executable you choose or write adheres to these rules, you may use it.
Note for developers: you may direct the output of the file descriptors stdout and stderr to other files.
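A task that follows the rules above could look like the Python sketch below. It is a hypothetical example: the summary format and the function name summarize are invented for illustration, and whether a task needs to print any HTTP headers of its own is not covered by this document.

```python
import sys
from urllib.parse import parse_qs


def summarize(encoded):
    """Decode url-encoded form data and return one 'name: value' line per field."""
    fields = parse_qs(encoded, keep_blank_values=True)
    lines = [f"{name}: {', '.join(values)}" for name, values in fields.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    body = sys.stdin.read()                 # rule 1: read from stdin
    print(summarize(body))                  # rule 3: stdout goes to the browser
    # rule 4: stderr ends up in the webserver's error log
    print(f"processed {len(body)} bytes", file=sys.stderr)
```

Saved as an executable script, such a program could be listed in @tasks with its full path.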
FormFiler requires "write" access to the directory and data files it creates and writes. This can be accomplished by setting "world" write permissions on those directories and files. Example: you want FormFiler to write its files in the "formdata" directory:
$ chmod o+w formdata
The above grants all users on your system and across the web permission to write into the formdata directory, but does not grant permission to read. As you can guess, this is not a good idea and we do NOT recommend it. We suggest that you ask your ISP if you can use "cgiwrap" or, for Apache sites, "suEXEC". Either of these facilities will allow your scripts to be run under the user ID of the script owner. This allows you to be more restrictive and remove all world "read/write/execute" permissions on all your FormFiler databases and directories. The utility "cgiwrap" is freely available.
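If you want to check which of your FormFiler directories or files currently carry the "world" write bit, a short Python sketch using the standard os and stat modules is shown below. The function name is_world_writable is hypothetical.

```python
import os
import stat


def is_world_writable(path):
    """Return True if 'other' users have write permission on path."""
    mode = os.stat(path).st_mode
    # S_IWOTH is the permission bit that "chmod o+w" sets.
    return bool(mode & stat.S_IWOTH)
```

After switching to cgiwrap or suEXEC, this check should return False for every data directory.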
Password Protecting Your Webpages
Access to the webpages that harvest your data files should be restricted to those you authorize. Authentication is available through the HTTP 1.1 Authentication mechanism. This section gives a brief description of how to set up a password protected directory on your webserver.
Policies and configurations vary widely between webservers. The following description applies specifically to the Apache webserver. You should consult your ISP or webmaster for complete details concerning authentication configuration. If your ISP does not offer authentication look for one that does. The following description is intended by way of an example only.
Security Hint: Your harvest webpages should
reside in a directory separate from your publicly available webpages. This
will allow you to
control and authorize access to the harvest form(s).
Step 1. From your "Document Root Directory" create a new subdirectory which will hold your harvest forms. Example:
$ mkdir private
Makes a new subdirectory called "private".
Step 2. Change to the new subdirectory.
$ cd private
Step 3. Create a password file and add an authorization.
$ htpasswd -c .users ralph
Adding password for ralph.
New password: *******
Re-type new password: *******
$
This creates the password file ".users" and adds the authorization ID "ralph" and a password.
Step 4. The next step is to enable your webserver's access control. This is accomplished by creating an access control file ".htaccess" in the directory; this file will contain your access options. Use a programmer's text editor to create the file ".htaccess". Enter the following:
AuthType Basic
AuthName "Private Area"
AuthUserFile /home/ralph/public_html/private/.users
require valid-user
AuthType Basic enables basic password authentication. AuthName "Private Area" is a simple name to identify this area of your webserver. AuthUserFile /home/ralph/public_html/private/.users tells the webserver where to find the file containing valid IDs. The option require valid-user tells the webserver to permit access to this directory only to those IDs contained in the AuthUserFile. If instead you wish to restrict access to this directory to "ralph" alone, use this option: require user ralph. This allows only ralph access to this directory. Again, for specific detailed information regarding security features on your site, consult your webmaster, ISP or webserver documentation.
Once you have completed this, any webpages you place in the "private" directory will require a password before they can be accessed.
After installation you should have the following files:
/home/USERID/cgi-bin/FormFiler.cgi
/home/USERID/cgi-bin/FormFiler.pl
/home/USERID/cgi-bin/FormFiler.pm
/home/USERID/cgi-bin/FormHarvest.cgi
/home/USERID/cgi-bin/FormHarvest.pm
/home/USERID/cgi-bin/cgi-lib.pl
/home/USERID/public_html/SampleForm.html
/home/USERID/public_html/successpage.html
/home/USERID/public_html/getdata.html
/home/USERID/doc/README
/home/USERID/doc/FormFiler.html
The FormFiler(tm) package includes several files. Each one is briefly described below.
FormFiler.cgi The CGI script that does the work.
FormFiler.pl The library of functions used by FormFiler.cgi.
FormFiler.pm The configuration file used by FormFiler.cgi.
FormHarvest.cgi The CGI script that emails data files.
FormHarvest.pm The configuration file used by FormHarvest.cgi.
SampleForm.html A sample FORM demonstrating the use of FormFiler.cgi.
successpage.html A sample alternate response page.
getdata.html A sample retrieval FORM.
cgi-lib.pl A public domain CGI utility library used by FormFiler.
README A simple read me file.
FormFiler.html Documentation (this file) for FormFiler.
1. The webserver complains that it cannot execute "FormFiler.cgi". This is because the webserver usually runs as an unprivileged user. Make sure the script is executable. Example:
$ chmod +x FormFiler.cgi
Solution 2 (not recommended, but it works): If FormFiler cannot write its data files, change the permissions on the offending directory to allow "world" "write" permission. For example, you've configured FormFiler to write the data files in the directory "../mydata":
$ chmod o+w mydata
If you are having difficulties and you or your webmaster/ISP are unable to solve them, please send email to Data Exchange Associates, Inc.
Copyright © 1998 by Data Exchange
Associates, Inc. All rights reserved.