Common Gateway Interface
Web Technologies
Piero Fraternali

Outline
• Architectures for dynamic content publishing
  – CGI
  – Java Servlet
  – Server-side scripting
  – JSP tag libraries

Motivations
• Creating pages on the fly, based on the user's request and on structured data (e.g., database content)
• Client-side scripting and components do not suffice
  – They manipulate an existing document/page; they do not create a new one from structured content
• Solution:
  – Server-side architectures for dynamic content production

Common Gateway Interface
• An interface that allows the web server to launch external applications that create pages dynamically
• A kind of «double client-server loop»

What CGI is/is not
• It is not
  – A programming language
  – A telecommunication protocol
• It is
  – An interface between the web server and the applications that defines some standard communication variables
• The interface is implemented through environment variables, a mechanism available in virtually all operating systems
• A CGI program can be written in any programming language

Invocation
• The client specifies in the URI the name of the program to invoke
• The program must be deployed in a specified location on the web server (e.g., the cgi-bin directory)
  – http://my.server.web/cgi-bin/xyz.exe

Execution
• The server recognizes from the URI that the requested resource is an executable
  – Permissions must be set in the web server to allow program execution
  – E.g., the extensions of executable files must be explicitly specified
  – http://my.server.web/cgi-bin/xyz.exe
• The web server decodes the parameters sent by the client and initializes the CGI variables
  – REQUEST_METHOD, QUERY_STRING, CONTENT_LENGTH, CONTENT_TYPE
  – http://my.server.web/cgi-bin/xyz.exe?par=val
• The server launches the program in a new process
• The program executes and «prints» the response on the standard output
• The server builds the response from the content emitted to the standard output
and sends it to the client

Handling request parameters
• Client parameters can be sent in two ways
  – With the HTTP GET method
    • Parameters are appended to the URL (1)
    • http://www.myserver.it/cgi-bin/xyz?par=val
  – With the HTTP POST method
    • Parameters are inserted as an HTTP entity in the body of the request (suited to parameters of substantial size)
    • Requires the use of HTML forms to let users enter data into the body of the request
(1) The HTTP specification does not set any maximum URI length; practical limits are imposed by web browser and server software

HTML Form
    <HTML>
    <BODY>
    <FORM action="http://www.mysrvr.it/cgi-bin/xyz.exe" method="post">
    <P>Tell me your name:</P>
    <P><INPUT type="text" NAME="whoareyou"></P>
    <INPUT type="submit" VALUE="Send">
    </FORM>
    </BODY>
    </HTML>

Structure of a CGI program
• Read environment variables
• Execute business logic
• Print MIME heading: "Content-type: text/html"
• Print HTML markup

Parameter decoding
• Read variable REQUEST_METHOD
• Read variable QUERY_STRING
• Read variable CONTENT_LENGTH
• Read CONTENT_LENGTH bytes from the standard input

CGI development
• A CGI program can be written in any programming language:
  – C/C++
  – Fortran
  – PERL
  – TCL
  – Unix shell
  – Visual Basic
• If a compiled programming language is used, the source code must be compiled
  – Normally source files are in cgi-src
  – Executable binaries are in cgi-bin
• If instead an interpreted scripting language is used, the source files themselves are deployed
  – Normally in the cgi-bin folder

Overview of CGI variables
• Clustered per type:
  – server
  – request
  – headers

Server variables
• These variables are always available, i.e., they do not depend on the request
  – SERVER_SOFTWARE: name and version of the server software
    • Format: name/version
  – SERVER_NAME: hostname or IP address of the server
  – GATEWAY_INTERFACE: supported CGI version
    • Format: CGI/version

Request variables
• These variables depend on the request
  – SERVER_PROTOCOL: name and version of the protocol used for the request
    • Format: protocol/version
  – SERVER_PORT: port to which the request is sent
  – REQUEST_METHOD: HTTP request method
  – PATH_INFO: extra path information
  – PATH_TRANSLATED: translation of PATH_INFO from virtual to physical path
  – SCRIPT_NAME: virtual path of the invoked script
  – QUERY_STRING: the query string

Other request variables
• REMOTE_HOST: client hostname
• REMOTE_ADDR: client IP address
• AUTH_TYPE: authentication type used by the protocol
• REMOTE_USER: username used during the authentication
• CONTENT_TYPE: content type of the request body, for the POST and PUT request methods
• CONTENT_LENGTH: length of the request body

Environment variables: headers
• The HTTP headers contained in the request are stored in the environment with the prefix HTTP_
  – HTTP_USER_AGENT: browser used for the request
  – HTTP_ACCEPT_ENCODING: encoding type accepted by the client
  – HTTP_ACCEPT_CHARSET: charset accepted by the client
  – HTTP_ACCEPT_LANGUAGE: language accepted by the client

CGI script for inspecting variables
    #include <stdio.h>
    #include <stdlib.h>

    /* Print one CGI variable; getenv() returns NULL when the variable is not set. */
    static void show(const char *name)
    {
        const char *value = getenv(name);
        printf("%s: %s<br>\n", name, value != NULL ? value : "(not set)");
    }

    int main(void)
    {
        /* The header block must be closed by an empty line. */
        printf("Content-Type: text/html\n\n");
        printf("<html><head><title>Request variables</title></head>");
        printf("<body><h1>Some request header variables:</h1>");
        fflush(stdout);
        show("SERVER_SOFTWARE");
        show("GATEWAY_INTERFACE");
        show("REQUEST_METHOD");
        show("QUERY_STRING");
        show("HTTP_USER_AGENT");
        show("HTTP_ACCEPT_ENCODING");
        show("HTTP_ACCEPT_CHARSET");
        show("HTTP_ACCEPT_LANGUAGE");
        show("HTTP_REFERER");
        show("REMOTE_ADDR");
        printf("</body></html>");
        return 0;
    }

Example output
[Figure: browser view of the page produced by the script]

Problems with CGI
• Performance and security issues in web server to application communication
• When the server receives a request, it
creates a new process in order to run the CGI program
• This requires time and significant server resources
• A CGI program cannot interact back with the web server
• The process of the CGI program is terminated when the program finishes
• No sharing of resources between subsequent calls (e.g., reuse of database connections)
• No main-memory preservation of the user's session (database storage is necessary if session data are to be preserved)
• Exposing to the web the physical path to an executable program can breach security

References
• CGI reference:
  – http://www.w3.org/CGI/
• Security and CGI:
  – http://www.w3.org/Security/Faq/index.html

Complete example
[Figure: request flow for Form.html and Mult.cgi]
1. First request
2. Resource retrieval (Form.html)
3. Response
4. Second request
5. Environment variables set and program invoked
6. Response computation (Mult.cgi)
7. Response sent
Mult.c was previously compiled into Mult.cgi.

The form (form.html)
    <HTML>
    <HEAD><TITLE>Multiplication form</TITLE></HEAD>
    <BODY>
    <!-- ACTION holds the URL that is invoked -->
    <FORM ACTION="http://www.polimi.it/cgi-bin/run/mult.cgi">
    <P>Enter the numbers to multiply</P>
    <INPUT NAME="m" SIZE="5"><BR/>
    <INPUT NAME="n" SIZE="5"><BR/>
    <INPUT TYPE="SUBMIT" VALUE="Multiply">
    </FORM>
    </BODY>
    </HTML>

View in a browser
[Figure: the form rendered in a browser]

The script
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        char *data;
        long m, n;

        /* Statements that print the response to the standard output */
        printf("%s%c%c\n", "Content-Type:text/html;charset=iso-8859-1", 13, 10);
        printf("<HTML>\n<HEAD>\n<TITLE>Multiplication result</TITLE>\n</HEAD>\n");
        printf("<BODY>\n<H3>Multiplication result</H3>\n");

        /* Retrieval of values from environment variables */
        data = getenv("QUERY_STRING");
        if (data == NULL)
            printf("<P>Error! Could not receive the data from the form.</P>\n");
        else if (sscanf(data, "m=%ld&n=%ld", &m, &n) != 2)
            printf("<P>Error! Invalid data. Values must be numeric.</P>\n");
        else
            printf("<P>Result: %ld * %ld = %ld</P>\n", m, n, m * n);
        printf("</BODY>\n</HTML>\n");
        return 0;
    }

Compilation and local test
• Compilation:
    $ gcc -o mult.cgi mult.c
• Local test (the environment variable containing the query string is set manually):
    $ export QUERY_STRING="m=2&n=3"
    $ ./mult.cgi
• Result:
    Content-Type:text/html;charset=iso-8859-1

    <HTML>
    <HEAD>
    <TITLE>Multiplication result</TITLE>
    </HEAD>
    <BODY>
    <H3>Multiplication result</H3>
    <P>Result: 2 * 3 = 6</P>
    </BODY>
    </HTML>

Considerations on CGI
• Possible security problems
• Performance (overhead)
  – creating and terminating processes takes time
  – context switches take time
• CGI processes:
  – are created at each invocation
  – do not inherit process state from previous invocations (e.g., database connections)

References
• CGI reference: http://hoohoo.ncsa.uiuc.edu/cgi/overview.html
• Security and CGI: http://www.w3.org/Security/Faq/wwwsf4.html
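
Sketch: parameter decoding
The parameter-decoding steps listed in these slides (read REQUEST_METHOD, then QUERY_STRING for GET, or CONTENT_LENGTH bytes from the standard input for POST) could be sketched in C roughly as follows. The function name read_raw_params and the MAX_BODY limit are hypothetical and not part of the original material. The returned string is still URL-encoded; splitting it into name/value pairs is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Upper bound on the accepted POST body size (an assumption of this sketch). */
#define MAX_BODY 4096

/*
 * Hypothetical helper: returns a heap-allocated copy of the raw, still
 * URL-encoded parameter string, or NULL if it cannot be obtained.
 * `in` is the stream carrying the request body (stdin under a web server).
 * The caller frees the result.
 */
static char *read_raw_params(FILE *in)
{
    assert(in != NULL);

    const char *method = getenv("REQUEST_METHOD");
    if (method == NULL)
        return NULL;

    if (strcmp(method, "GET") == 0) {
        /* GET: the parameters arrive in QUERY_STRING. */
        const char *qs = getenv("QUERY_STRING");
        if (qs == NULL)
            qs = "";
        char *copy = malloc(strlen(qs) + 1);
        if (copy != NULL)
            strcpy(copy, qs);
        return copy;
    }

    if (strcmp(method, "POST") == 0) {
        /* POST: read at most CONTENT_LENGTH bytes from the input stream. */
        const char *len_text = getenv("CONTENT_LENGTH");
        if (len_text == NULL)
            return NULL;
        char *end;
        long len = strtol(len_text, &end, 10);
        if (end == len_text || *end != '\0' || len < 0 || len > MAX_BODY)
            return NULL;
        char *body = malloc((size_t)len + 1);
        if (body == NULL)
            return NULL;
        size_t got = fread(body, 1, (size_t)len, in);
        body[got] = '\0';
        return body;
    }

    return NULL; /* other request methods are not handled in this sketch */
}
```

A CGI program would typically call read_raw_params(stdin) before printing its Content-Type header, and report an error page when the result is NULL. Bounding the body size before allocating is a design choice of this sketch, meant to avoid trusting a client-supplied length.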