CGI Post Method
by Ronald Weidner [2005.06.01]

First Things First

Getting POST data from a simple form submission is, well, simple. The first thing to understand is that, unlike GET data, POST data is streamed into your CGI program via standard input (stdin). So, if you want POST data, you'll need to read it in when your CGI program starts. Here is a little test example of how to do it.

#!/usr/local/euphoria/bin/exu
include "get.e"
puts(1, "Content-type: text/html\n\n")
with trace
object char
sequence post_var
sequence temp
atom post_len
post_var = ""

-- the server will tell us how many chars
-- the POST data is by setting an environment
-- variable named CONTENT_LENGTH
if atom(getenv("CONTENT_LENGTH")) then -- we expect a sequence, an atom is an error
    post_len = 0 -- if we have an error, set post_len to 0
else
    temp = value(getenv("CONTENT_LENGTH")) --get the value of CONTENT_LENGTH
    if temp[1] = GET_SUCCESS and atom(temp[2]) then
        post_len = temp[2] -- Assign value to post_len
    else
        post_len = 0 -- not a usable number, treat as no data
    end if
end if
if post_len > 0 then -- now we'll fetch the data
    for i = 1 to post_len do
        char = getc(0)
        if char = -1 then -- stdin ended before CONTENT_LENGTH chars arrived
            exit
        end if
        post_var = post_var & char -- assign data to post_var one char at a time
    end for
end if

-- Now we will generate a simple form for data transfer
puts (1, "<html>")
puts(1, "<head>")
puts (1, "<title>Euphoria Post Example</title>")
puts(1, "</head>")
puts(1, "<body>")
puts(1, "<h1>Euphoria CGI Post Example</h1>")

-- if values were POSTed
if length(post_var) > 0 then
    puts(1, "<div><h3>Post Vars</h3><p>")
    puts(1, post_var)
    puts(1, "</p></div>")
end if

-- Create a form
puts (1, "<form method='post'>")
puts (1, "<input name=textBox1 /><br />")
puts (1, "<input type='submit' name='submit1' value='Submit' /><br />")
puts (1, "</form>")

puts(1, "</body>")
puts(1, "</html>")

-- All done

Once you run this little program, you will see that we can indeed read the POST data from stdin. After using the form a few times, you'll notice that the data received isn't always shown as it was typed. That's because, by the time your CGI program reads what was POSTed, its value has been URL-encoded: some characters arrive as a % followed by their hex value. The reason is that some chars, especially %'s, #'s and &'s, have a special meaning in the request and would cause confusion if sent as typed. As a side effect, it becomes our problem to deal with this. That will be the topic of the next section.

Cleaning Up the Data in the POST Stream

As the last section mentioned, what we read in from stdin isn't always what was typed when the HTML form was sent. Non-alphanumeric characters are now represented as hex values and spaces are now +'s. (If your browser is standards compliant, spaces become +'s; otherwise spaces become %20, a hex representation.) So how do we fix the data? Well, we parse it, of course. The following function is a simple parser that evaluates what was sent and restores the data to its original form.

-- find %XX and replace it with the ascii
-- value. Also replace [+] with a [space]
include get.e -- provides value()

global function decode_client_data(sequence str_data)
    -- declarations
    atom data_len -- length of str_data
    atom str_index -- current index of str_data
    atom char -- char at current index
    sequence hex_code -- str representation of a hex value
    sequence decoded -- str_data with hex and + fixed
    sequence temp -- scratch pad var
    -- init vars
    decoded = ""
    hex_code = { '#',0,0 }
    str_index = 1
    data_len = length(str_data)
    -- parse str_data
    while str_index < data_len + 1 do
        -- mozilla and friends send [+] for [space]
        -- all browsers prepend hex values with a [%]
        if equal(str_data[str_index], '+') then
            char = ' ' -- we'll be needing a [space] to replace the [+]
        elsif equal(str_data[str_index], '%') then
            -- the next 2 chars after the % symbol are the hex value
            -- of the character we need.
            str_index += 1
            hex_code[2] = str_data[str_index]
            str_index += 1
            hex_code[3] = str_data[str_index]
            temp = value(hex_code) --get the value of the hex_code
            char = temp[2] --store the numeric part in the char var
        else
            char = str_data[str_index] -- no event so just echo
        end if
        decoded = append(decoded, char) -- add the char to the result
        str_index += 1 -- increment the loop counter
    end while
    return decoded -- the decoded value
end function
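
The function above trusts its input: it assumes every % is followed by two characters, and it relies on value() to read the hex digits. As far as I know, Euphoria hex literals only accept the capital letters A-F, so a lowercase escape such as %3d may not decode as expected. Here is a minimal sketch of a more defensive variant. The names hex_digit() and safe_decode() are hypothetical and are not part of the original library.

```euphoria
-- Hypothetical helpers, not part of the original library.

-- return 0..15 for a hex digit character, or -1 if c is not one
function hex_digit(integer c)
    if c >= '0' and c <= '9' then
        return c - '0'
    elsif c >= 'A' and c <= 'F' then
        return c - 'A' + 10
    elsif c >= 'a' and c <= 'f' then
        return c - 'a' + 10
    end if
    return -1
end function

-- decode [+] and %XX; malformed or truncated escapes are copied unchanged
function safe_decode(sequence str_data)
    sequence decoded
    integer i, hi, lo, c
    decoded = ""
    i = 1
    while i <= length(str_data) do
        c = str_data[i]
        if c = '+' then
            c = ' '
        elsif c = '%' and i + 2 <= length(str_data) then
            hi = hex_digit(str_data[i + 1])
            lo = hex_digit(str_data[i + 2])
            if hi >= 0 and lo >= 0 then
                c = hi * 16 + lo -- combine the two hex digits
                i += 2           -- skip the digits we just consumed
            end if
        end if
        decoded = append(decoded, c)
        i += 1
    end while
    return decoded
end function

puts(1, safe_decode("a+b%26c%3d1") & "\n") -- prints: a b&c=1
```

Doing the hex arithmetic by hand avoids the dependency on get.e, at the cost of a few more lines.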

Now that we know how to get the data back to normal, the question becomes when to make that transformation. In my opinion, the time to do this is after the POST string has been parsed into a name/value hash. That way, the string can be parsed according to the protocol already set up: while the data is still encoded, any & or = in it really is a delimiter and not something the user typed. The next section will demonstrate how to write an event-driven parser to do just that.

Part III

Creating a hash of name/value pairs from the POST data is very easy, because the text contained in the POST string is already formatted for parsing. The string is delimited by 2 characters, an & and an =. The & separates the name/value pairs, and the = separates each name from its value. For example:

dude=John&age=25

Here in this example we have two pairs separated by the &. Each pair has a name/value separated by the =.

What we want to do is separate this into a structure that is easier to work with. Here is what the end goal will look like...

hash_array = {{"dude", "John"}, {"age", "25"}}

The following function is an event based, state managed parser. This one is simple but the concept is very important to future web development applications. This type of parser could be used for spiders, link checkers, and screen scrapers so it might be advantageous to see how it works now while the details are still simple.

constant NAME = 1, VALUE = 2 -- indexes into a name/value pair

function hash_request(sequence str_data)
    atom str_index --the current index in str_data
    atom data_len --the len of str_data
    sequence state --current state of the parser
    sequence hash_array --an array of hash structs
    sequence hash --struct of name/value pairs
    sequence name --the name of a get/post var
    sequence val --the value of a get/post var
    state = "NAME" --set the default state
    hash_array = {}
    hash = { "", "" }
    name = ""
    val = ""
    data_len = length(str_data)
    str_index = 1
    -- start parser
    while str_index < data_len + 1 do
        -- if an & is found, save what we got and
        -- get ready for a new name/value pair
        if equal(str_data[str_index], '&') then
            hash[NAME] = name --NAME is a constant of 1
            hash[VALUE] = val --VALUE is a constant of 2
            hash_array = append(hash_array, hash) --add hash to array
            -- reset and get ready for next
            state = "NAME"
            name = ""
            val = ""
        -- if an = is found change state to VALUE
        -- because the next chars will be the value
        -- of the name/value pair
        elsif equal(str_data[str_index], '=') then
            state = "VALUE"
        else
            -- append the correct sequence depending on
            -- what state the parser is in.
            if equal(state, "NAME") then
                name = append(name, str_data[str_index])
            else
                val = append(val, str_data[str_index])
            end if
        end if
        str_index += 1
    end while
    -- since the str_data will not end with an & we need to
    -- create a hash and append it to the array based on
    -- current values of name and val.
    hash[NAME] = name
    hash[VALUE] = val
    hash_array = append(hash_array, hash)
    return hash_array
end function
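
The parser relies on two constants, NAME and VALUE, which its comments describe as 1 and 2. As a small sketch of how the result can be walked, the snippet below declares those constants and loops over a literal sequence that stands in for the return value of hash_request("dude=John&age=25"). The literal is an assumption made to keep the example self-contained.

```euphoria
constant NAME = 1, VALUE = 2 -- indexes into each name/value pair

sequence hash_array
-- literal stand-in for the result of hash_request("dude=John&age=25")
hash_array = {{"dude", "John"}, {"age", "25"}}

-- print every pair, one per line
for i = 1 to length(hash_array) do
    printf(1, "%s is %s\n", {hash_array[i][NAME], hash_array[i][VALUE]})
end for
```

Indexing with named constants rather than bare 1 and 2 keeps the calling code readable once the structure is passed around.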

Now that we have this data parsed into a name/value hash, you might be wondering what the value in that was. To illustrate, let's look at a search function that can search these kinds of hashes for a particular name. The idea is to ask for the name of the variable and get back its value.

global function request_post(sequence name)
    atom i -- current index of array
    sequence val -- this is what we're looking for
    i = 1
    val = ""
    while i < length(POST) + 1 do
        if equal(POST[i][NAME], name) then --POST is a global var
            val = POST[i][VALUE] --NAME and VALUE are constants 1 and 2
            exit
        end if
        i += 1
    end while
    return decode_client_data(val) -- decode_client_data(val) fixes hex encodes
end function

We are getting close to having a fully functional CGI library. We need to add a few more things to make it really useful. The first of which, and the topic of my next writing, is a function that will read a text file and output it verbatim. This will make some coding tasks easier and less cumbersome. Then we'll explore using cookies.

Part IV

Now, let's discuss two more common functions of CGI programming. Often, it is just easier to write straight HTML than it is to keep writing all those puts(1, "some string") statements. But if you just start writing HTML in the middle of your Euphoria program, the interpreter will simply fail. So what we need in this case is a function that will read a file containing the HTML and send it out to stdout. Here is such a procedure.

-- Procedure to open a text file, read it line
-- by line, and send it to stdout
global procedure send_text(sequence file_name)
    atom fp -- file id
    object str_line -- the line of text as read from the file
    fp = open(file_name, "r") -- open the file
    if fp = -1 then
        -- if ERROR send a message and quit
        puts(1, "Text file " & file_name & " not found.")
    else
        while 1 do
            str_line = gets(fp) -- read a line from the file
            if atom(str_line) then -- if EOF, quit
                exit
            else
                puts(1, str_line) -- send line to stdout
            end if
        end while
        close(fp) -- clean up (only if the file was opened)
    end if
end procedure

Ok, that was easy enough. But what if you want to read a text file, examine it or parse it, and then decide whether or not you want to send it? With the technique above, you can't do any of those things, but if you stored the text in a buffer, you could. Here is an example of how something like that might work.

global function buffer_fill(sequence file_name)
    atom fp -- file id
    sequence buffer -- the buffer to be filled with text
    object str_line -- the line of text as read from a file
    buffer = {} -- initialize buffer
    fp = open(file_name, "r") -- open the file for reading
    if fp = -1 then
        buffer = append(buffer, -1) -- report an error
        buffer = append(buffer, "Text file " & file_name & " not found.") --error msg
    else
        while 1 do
            str_line = gets(fp) -- get a line of text from the file.
            if atom(str_line) then -- if EOF, exit now
                exit
            else
                buffer = append(buffer, str_line) -- append buffer with the new text
            end if
        end while
        close(fp) -- clean up (only if the file was opened)
    end if
    return buffer
end function
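
To show why buffering is useful, here is a minimal sketch that examines a buffer before sending it: it checks for the error marker (-1 as the first element) and substitutes a placeholder in every line. The helper names replace_all() and send_buffer(), the {{user}} placeholder, and the literal buffer standing in for the result of buffer_fill() are all assumptions made for illustration.

```euphoria
-- Hypothetical helpers, not part of the original library.

-- replace every occurrence of placeholder (must be non-empty) in line
function replace_all(sequence line, sequence placeholder, sequence text)
    integer pos
    sequence result
    result = ""
    pos = match(placeholder, line)
    while pos > 0 do
        -- keep what comes before the match, then the replacement text
        result = result & line[1..pos - 1] & text
        line = line[pos + length(placeholder)..length(line)]
        pos = match(placeholder, line)
    end while
    return result & line
end function

procedure send_buffer(sequence buffer, sequence placeholder, sequence text)
    -- a failed read is reported as {-1, message}
    if length(buffer) > 0 and atom(buffer[1]) then
        puts(1, buffer[2])
        return
    end if
    for i = 1 to length(buffer) do
        puts(1, replace_all(buffer[i], placeholder, text))
    end for
end procedure

-- literal stand-in for a buffer read from an HTML file
send_buffer({"<h1>Hello, {{user}}</h1>\n", "<p>Bye, {{user}}</p>\n"},
            "{{user}}", "Ron")
```

Because lines read with gets() are always sequences, an atom in the first slot can only be the error marker, which is what the check in send_buffer() relies on.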

Session Management with Cookies

Ok, I've saved the best for last in this series. Cookies. The big bad C word in web programming. Nothing strikes more fear in web surfers than the fear of the dreaded cookie. (Except, perhaps, the evil JavaScript.) I'll stop the sarcasm now in favor of providing some real information about cookies. The first thing you need to consider about cookies is that they may not work. Many web surfers have found that they can turn off that feature and have made the choice to do so. Bottom line: have a back-up plan if you are going to use cookies.

Cookies are small bits of data stored in plain text on the client computer. The cool thing about cookies is that they can survive long after your script has finished running. You can use them to store information that persists between visits. For example, you could use cookies so that the next time your visitor returns, he is automatically brought back to the last page he viewed. Sometimes cookies are used to store visitor preferences, or to maintain a "logged in" state.

So, how are cookies sent to the CGI script? Well, that's done in much the same way as the GET variables, via an environment variable. The web server, before running your script, sets an environment variable named HTTP_COOKIE and sets its value to a list of ; separated name/value pairs. What this means to us is that we must make a little change to our hash_request function in order to take into account the new delimiter. Also, some web clients like to put in an extra space before the name of the cookie variable. This causes a little trouble for us when searching for a value by name. To overcome this, I wrote a short and underpowered ltrim() function to remove leading spaces. That way, when I store the name part of the cookie hash, it won't have leading spaces.

Except for the two minor changes mentioned above, the hash_request function is the same as it was before. Below is the new function to initialize the COOKIE variable.

procedure cookie_init()
    sequence hash
    hash = {}
    if sequence(getenv("HTTP_COOKIE")) then
        hash = hash_request(getenv("HTTP_COOKIE"))
    end if
    COOKIE = hash
end procedure
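
The two changes described earlier (a ; delimiter and an ltrim() on each name) are not listed in this article, so here is a rough sketch of what they might look like. This ltrim() is only a guess at the author's version, and parse_pairs() is a hypothetical stand-in for the modified hash_request, written with a delimiter parameter so the same code could serve both POST data and cookies.

```euphoria
-- Hypothetical sketch, not the author's actual code.

-- remove leading spaces only
function ltrim(sequence s)
    integer i
    i = 1
    while i <= length(s) and s[i] = ' ' do
        i += 1
    end while
    return s[i..length(s)]
end function

-- split text such as "a=1; b=2" on delim, trimming each name
function parse_pairs(sequence str_data, integer delim)
    sequence pairs, name, val
    integer in_value, c
    pairs = {}
    name = ""
    val = ""
    in_value = 0
    for i = 1 to length(str_data) do
        c = str_data[i]
        if c = delim then
            pairs = append(pairs, {ltrim(name), val})
            name = ""
            val = ""
            in_value = 0
        elsif c = '=' and not in_value then
            in_value = 1 -- only the first = switches to the value
        elsif in_value then
            val = append(val, c)
        else
            name = append(name, c)
        end if
    end for
    -- the text does not end with a delimiter, so store the last pair
    pairs = append(pairs, {ltrim(name), val})
    return pairs
end function

sequence cookies
cookies = parse_pairs("user=ron; theme=dark", ';')
for i = 1 to length(cookies) do
    printf(1, "[%s] = [%s]\n", cookies[i])
end for
```

Unlike the parser shown in Part III, this sketch keeps any later = as part of the value, which is one possible design choice rather than a requirement.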

Now that we know how cookies get to the script, how can we go the other way and send a cookie to the client? There are 2 options. First, you can set a cookie using the add_cookie procedure. Cookies are sent in the header part of the HTTP response. Basically it looks something like this...

Content-type: text/html
Set-cookie:name=value
.
.

(Where the 2 periods are supposed to represent new line characters.)

The tricky part here is that you must not send the header before the cookie data has been added to it. If you do, you will likely see the cookie data displayed in the web browser rather than stored on the client.

The following 3 procedures work in tandem to send cookies.

-- send the protocol header
global procedure send_header()
    atom i
    i = 1
    while i < length(HEADERS) + 1 do
        puts(1, HEADERS[i] & "\n")
        i += 1
    end while
    puts(1, "\n") -- a single blank line ends the header
end procedure

procedure append_header(sequence header_data)
    HEADERS = append(HEADERS, header_data)
end procedure

global procedure add_cookie(sequence nom, sequence val)
    append_header(sprintf("Set-cookie:%s=%s;", { nom, val }))
end procedure

The thing to notice here is that add_cookie doesn't actually send the cookie to the client. Instead, it puts its value in the global array of sequences HEADERS. That way, the cookie data gets sent only when you send the header.
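
One way to guard against queueing a cookie after the header has gone out is to track whether it was sent. The sketch below is only an illustration of that idea: the header_sent flag, the names try_add_cookie() and send_header_once(), and the starting value of HEADERS are all assumptions and are not taken from the library.

```euphoria
-- Hypothetical sketch, not part of the original library.
sequence HEADERS
integer header_sent
HEADERS = {"Content-type: text/html"} -- assumed starting value
header_sent = 0

-- queue a cookie; returns 1 on success, 0 if the header already went out
function try_add_cookie(sequence nom, sequence val)
    if header_sent then
        return 0
    end if
    HEADERS = append(HEADERS, sprintf("Set-cookie:%s=%s;", {nom, val}))
    return 1
end function

-- send the queued header lines exactly once
procedure send_header_once()
    if header_sent then
        return
    end if
    for i = 1 to length(HEADERS) do
        puts(1, HEADERS[i] & "\n")
    end for
    puts(1, "\n") -- blank line ends the header
    header_sent = 1
end procedure

integer first, second
first = try_add_cookie("last_page", "index.html") -- queued before the header
send_header_once()
second = try_add_cookie("too", "late")            -- refused, header already sent
```

Returning a status instead of silently appending lets the caller fall back to another approach when it is too late to set the cookie in the header.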

The second way to get cookies on the client is to use JavaScript. This is a little less tricky and well documented on the Internet. The main advantage of this method is that you can set the cookies at any time, regardless of whether or not you have already sent the header. I'm not going to spend any more time here talking about this method. If you want more information on this, I would suggest looking at the demo that came with this tutorial and searching Google for "javascript cookies."

I highly recommend downloading and viewing the code project that goes along with this tutorial. In the tp80_cgilib.e file you will find several functions not mentioned in this series that may help you solve some of your own problems. The project contains a mini example that glues together many of the topics we've been discussing.

Ok, that's it for this series. If you would like to know more about web programming, CGI, or Euphoria, I would suggest you ask your questions on the EUforum Message Board. If you would like to contact me for assistance, please use my contact form at my web site.


To get in touch with the author, visit his website (below).

Ronald Weidner
Website: TechPort80