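After reading what others have written online about Tinyhttpd, I decided to write down my own learning process and understanding; the reference links are at the end. I put the code on gitee: Tinyhttpd学习
1. Tinyhttpd is a lightweight HTTP server.
2. Studying the project shows the basic logic a web server follows when it receives a request for a static page or a CGI request.
Before starting I knew nothing about HTTP, so I begin by analyzing the program and leave the HTTP background to the end.
The source file httpd.c contains the following main functions: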
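void accept_request(void *); // thread function that handles one request
void bad_request(int); // error reply (400)
void cat(int, FILE *); // sends the contents of an open file to the socket
void cannot_execute(int); // error reply (500)
void error_die(const char *); // prints an error with perror() and exits
void execute_cgi(int, const char *, const char *, const char *); // runs a CGI script
int get_line(int, char *, int); // reads one line from the socket descriptor
void headers(int, const char *); // sends the response headers
void not_found(int); // error reply (404)
void serve_file(int, const char *); // sends a regular file
int startup(u_short *); // creates the socket, binds it, starts listening, and returns the descriptor from socket(); the argument points to a u_short port number
void unimplemented(int); // error reply (501)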
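The figure below shows the structure of the program.
The figure comes from the internet, but I added a few things based on my own reading; see the reference links for details.
Program flow when a CGI script is executed (figure from the internet):
The source follows, with some comments of my own added: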
/* J. David's webserver */
/* This is a simple webserver.
* Created November 1999 by J. David Blackstone.
* CSE 4344 (Network concepts), Prof. Zeigler
* University of Texas at Arlington
*/
/* This program compiles for Sparc Solaris 2.6.
* To compile for Linux:
* 1) Comment out the #include <pthread.h> line.
* 2) Comment out the line that defines the variable newthread.
* 3) Comment out the two lines that run pthread_create().
* 4) Uncomment the line that runs accept_request().
* 5) Remove -lsocket from the Makefile.
*/
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>
#include <ctype.h>
#include <strings.h>
#include <string.h>
#include <sys/stat.h>
#include <pthread.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <stdint.h>
#define ISspace(x) isspace((int)(x))
#define SERVER_STRING "Server: jdbhttpd/0.1.0\r\n"
#define STDIN 0
#define STDOUT 1
#define STDERR 2
void accept_request(void *);
void bad_request(int);
void cat(int, FILE *);
void cannot_execute(int);
void error_die(const char *);
void execute_cgi(int, const char *, const char *, const char *);
int get_line(int, char *, int);
void headers(int, const char *);
void not_found(int);
void serve_file(int, const char *);
int startup(u_short *);
void unimplemented(int);
/**********************************************************************/
/* A request has caused a call to accept() on the server port to
* return. Process the request appropriately.
* Parameters: the socket connected to the client */
/**********************************************************************/
void accept_request(void *arg)
{
int client = (intptr_t)arg;
char buf[1024];
size_t numchars;
char method[255];
char url[255];
char path[512];
size_t i, j;
struct stat st;
int cgi = 0; /* becomes true if server decides this is a CGI
* program */
char *query_string = NULL;
// Read the request line from the client into buf; numchars is the number of bytes stored.
numchars = get_line(client, buf, sizeof(buf));
i = 0; j = 0;
while (!ISspace(buf[i]) && (i < sizeof(method) - 1))
{
// Copy the method token from buf into method (at most 254 characters plus the terminator).
method[i] = buf[i];
i++;
}
j=i;
method[i] = '\0';
printf("method==%s\n",method);
// strcasecmp compares two strings, ignoring case.
if (strcasecmp(method, "GET") && strcasecmp(method, "POST"))
{
// The method is neither GET nor POST: reply 501.
unimplemented(client);
return;
}
// A POST request is always handled as CGI.
if (strcasecmp(method, "POST") == 0)
cgi = 1;
i = 0;
// j was set to i above, so buf[j] is the first character after the method token.
// Skip the whitespace between the method and the URL.
while (ISspace(buf[j]) && (j < numchars))
j++;
while (!ISspace(buf[j]) && (i < sizeof(url) - 1) && (j < numchars))
{
// Copy the URL from buf into url[].
url[i] = buf[j];
i++; j++;
}
url[i] = '\0';
// For a GET request, look for a query string in the URL.
printf("url==%s\n",url);
if (strcasecmp(method, "GET") == 0)
{
query_string = url;
// Scan url until '?' or the terminating '\0'.
while ((*query_string != '?') && (*query_string != '\0'))
query_string++;
if (*query_string == '?')
{
// Found '?': set cgi=1 and replace '?' with '\0' so url ends before the query string.
cgi = 1;
printf("iiiii\n");
*query_string = '\0';
query_string++;
}
}
sprintf(path, "htdocs%s", url);
printf("path=%s\n",path);
if (path[strlen(path) - 1] == '/')
strcat(path, "index.html");
printf("path2%s\n",path);
if (stat(path, &st) == -1) {
// The requested resource was not found.
while ((numchars > 0) && strcmp("\n", buf)) /* read & discard headers */
numchars = get_line(client, buf, sizeof(buf));
not_found(client);
// Nothing more to do; close(client) below runs and the thread function returns.
}
else
{
// The resource exists.
if ((st.st_mode & S_IFMT) == S_IFDIR)
strcat(path, "/index.html");
if ((st.st_mode & S_IXUSR) ||
(st.st_mode & S_IXGRP) ||
(st.st_mode & S_IXOTH) )
cgi = 1;
printf("cgi=%d\n",cgi);
if (!cgi)
{
serve_file(client, path);
}
else
execute_cgi(client, path, method, query_string);
}
close(client);
}
/**********************************************************************/
/* Inform the client that a request it has made has a problem.
* Parameters: client socket */
/**********************************************************************/
void bad_request(int client)
{
char buf[1024];
sprintf(buf, "HTTP/1.0 400 BAD REQUEST\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "Content-type: text/html\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<P>Your browser sent a bad request, ");
send(client, buf, strlen(buf), 0);
sprintf(buf, "such as a POST without a Content-Length.\r\n");
send(client, buf, strlen(buf), 0);
}
/**********************************************************************/
/* Put the entire contents of a file out on a socket. This function
* is named after the UNIX "cat" command, because it might have been
* easier just to do something like pipe, fork, and exec("cat").
* Parameters: the client socket descriptor
* FILE pointer for the file to cat */
/**********************************************************************/
void cat(int client, FILE *resource)
{
char buf[1024];
fgets(buf, sizeof(buf), resource);
while (!feof(resource))
{
send(client, buf, strlen(buf), 0);
fgets(buf, sizeof(buf), resource);
}
}
/**********************************************************************/
/* Inform the client that a CGI script could not be executed.
* Parameter: the client socket descriptor. */
/**********************************************************************/
void cannot_execute(int client)
{
char buf[1024];
sprintf(buf, "HTTP/1.0 500 Internal Server Error\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "Content-type: text/html\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<P>Error prohibited CGI execution.\r\n");
send(client, buf, strlen(buf), 0);
}
/**********************************************************************/
/* Print out an error message with perror() (for system errors; based
* on value of errno, which indicates system call errors) and exit the
* program indicating an error. */
/**********************************************************************/
void error_die(const char *sc)
{
perror(sc);
exit(1);
}
/**********************************************************************/
/* Execute a CGI script. Will need to set environment variables as
* appropriate.
* Parameters: client socket descriptor
* path to the CGI script */
/**********************************************************************/
void execute_cgi(int client, const char *path,
const char *method, const char *query_string)
{
printf("client=%d\n",client);
char buf[1024];
int cgi_output[2];
int cgi_input[2];
pid_t pid;
int status;
int i;
char c;
int numchars = 1;
int content_length = -1;
buf[0] = 'A'; buf[1] = '\0';
// GET requests can reach this point too.
if (strcasecmp(method, "GET") == 0)
{
printf("the get\n");
while ((numchars > 0) && strcmp("\n", buf)) /* read & discard headers */
numchars = get_line(client, buf, sizeof(buf));
}
else if (strcasecmp(method, "POST") == 0) /*POST*/
{
printf("the post\n");
numchars = get_line(client, buf, sizeof(buf));
printf("post buf %s\n",buf);
while ((numchars > 0) && strcmp("\n", buf))
{
buf[15] = '\0';
if (strcasecmp(buf, "Content-Length:") == 0)
content_length = atoi(&(buf[16]));
numchars = get_line(client, buf, sizeof(buf));
}
if (content_length == -1) {
bad_request(client);
return;
}
}
else/*HEAD or other*/
{
}
// Create the two pipes.
printf("%d\n",__LINE__);
if (pipe(cgi_output) < 0) {
cannot_execute(client);
return;
}
if (pipe(cgi_input) < 0) {
cannot_execute(client);
return;
}
if ( (pid = fork()) < 0 ) {
cannot_execute(client);
return;
}
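// Send the status line from the parent only; after fork() both processes reach this point.
if (pid > 0) {
sprintf(buf, "HTTP/1.0 200 OK\r\n");
send(client, buf, strlen(buf), 0);
}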
if (pid == 0) /* child: CGI script */
{
char meth_env[255];
char query_env[255];
char length_env[255];
// sprintf(meth_env, "REQUEST_METHOD=%s", method);
// printf("meth_env=%s\n",meth_env);
dup2(cgi_output[1], STDOUT);
dup2(cgi_input[0], STDIN);
close(cgi_output[0]);
close(cgi_input[1]);
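sprintf(meth_env, "REQUEST_METHOD=%s", method);
// STDOUT now points at the pipe, so debug output goes to stderr to keep it out of the response.
fprintf(stderr, "meth_env=%s\n",meth_env);
// putenv adds or changes an environment variable.
putenv(meth_env);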
if (strcasecmp(method, "GET") == 0) {
sprintf(query_env, "QUERY_STRING=%s", query_string);
putenv(query_env);
}
else { /* POST */
sprintf(length_env, "CONTENT_LENGTH=%d", content_length);
putenv(length_env);
}
// Execute the CGI script; argv[0] is the path and the argument list ends with a null pointer.
execl(path, path, (char *)NULL);
exit(0);
} else { /* parent */
close(cgi_output[1]);
close(cgi_input[0]);
if (strcasecmp(method, "POST") == 0)
for (i = 0; i < content_length; i++) {
recv(client, &c, 1, 0);
// Forward the request body from the client to the child process.
printf("c=%c\n",c);
write(cgi_input[1], &c, 1);
}
while (read(cgi_output[0], &c, 1) > 0)
{
printf("cc=%c\n",c);
// Forward the data read from the pipe (the child's output) to the client, that is, the browser.
send(client, &c, 1, 0);
}
close(cgi_output[0]);
close(cgi_input[1]);
waitpid(pid, &status, 0);
}
}
/**********************************************************************/
/* Get a line from a socket, whether the line ends in a newline,
* carriage return, or a CRLF combination. Terminates the string read
* with a null character. If no newline indicator is found before the
* end of the buffer, the string is terminated with a null. If any of
* the above three line terminators is read, the last character of the
* string will be a linefeed and the string will be terminated with a
* null character.
* Parameters: the socket descriptor
* the buffer to save the data in
* the size of the buffer
* Returns: the number of bytes stored (excluding null) */
/**********************************************************************/
int get_line(int sock, char *buf, int size)
{
int i = 0;
char c = '\0';
int n;
while ((i < size - 1) && (c != '\n'))
{
n = recv(sock, &c, 1, 0);
/* DEBUG printf("%02X\n", c); */
if (n > 0)
{
if (c == '\r')
{
// MSG_PEEK returns data from the head of the receive queue without removing it,
// so the next recv call returns the same data.
n = recv(sock, &c, 1, MSG_PEEK);
/* DEBUG printf("%02X\n", c); */
if ((n > 0) && (c == '\n'))
// '\r' is followed by '\n': consume the '\n' that was just peeked, so CRLF is stored as a single '\n'.
recv(sock, &c, 1, 0);
else
// '\r' is not followed by '\n': treat it as '\n', which ends the loop.
c = '\n';
}
buf[i] = c;
i++;
}
else
c = '\n';
}
buf[i] = '\0';
return(i);
}
/**********************************************************************/
/* Return the informational HTTP headers about a file. */
/* Parameters: the socket to print the headers on
* the name of the file */
/**********************************************************************/
void headers(int client, const char *filename)
{
char buf[1024];
(void)filename; /* could use filename to determine file type */
strcpy(buf, "HTTP/1.0 200 OK\r\n");
send(client, buf, strlen(buf), 0);
printf("buf1=%s\n",buf);
strcpy(buf, SERVER_STRING);
send(client, buf, strlen(buf), 0);
printf("buf2=%s\n",buf);
sprintf(buf, "Content-Type: text/html\r\n");
send(client, buf, strlen(buf), 0);
printf("buf3=%s\n",buf);
strcpy(buf, "\r\n");
send(client, buf, strlen(buf), 0);
printf("buf4=%s\n",buf);
}
/**********************************************************************/
/* Give a client a 404 not found status message. */
/**********************************************************************/
void not_found(int client)
{
char buf[1024];
sprintf(buf, "HTTP/1.0 404 NOT FOUND\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, SERVER_STRING);
send(client, buf, strlen(buf), 0);
sprintf(buf, "Content-Type: text/html\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<HTML><TITLE>Not Found</TITLE>\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<BODY><P>The server could not fulfill\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "your request because the resource specified\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "is unavailable or nonexistent.\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "</BODY></HTML>\r\n");
send(client, buf, strlen(buf), 0);
}
/**********************************************************************/
/* Send a regular file to the client. Use headers, and report
* errors to client if they occur.
* Parameters: a pointer to a file structure produced from the socket
* file descriptor
* the name of the file to serve */
/**********************************************************************/
void serve_file(int client, const char *filename)
{
FILE *resource = NULL;
int numchars = 1;
char buf[1024];
buf[0] = 'A'; buf[1] = '\0';
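// I did not see the point of this loop at first; it reads and discards the remaining request headers so they are not left unread on the socket.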
while ((numchars > 0) && strcmp("\n", buf)) /* read & discard headers */
numchars = get_line(client, buf, sizeof(buf));
resource = fopen(filename, "r");
if (resource == NULL)
not_found(client);
else
{
headers(client, filename);
cat(client, resource);
// Close the file only when fopen succeeded; fclose(NULL) is undefined behavior.
fclose(resource);
}
}
/**********************************************************************/
/* This function starts the process of listening for web connections
* on a specified port. If the port is 0, then dynamically allocate a
* port and modify the original port variable to reflect the actual
* port.
* Parameters: pointer to variable containing the port to connect on
* Returns: the socket */
/**********************************************************************/
int startup(u_short *port)
{
int httpd = 0;
int on = 1;
struct sockaddr_in name;
//PF_INET is AF_INET
httpd = socket(PF_INET, SOCK_STREAM, 0);
if (httpd == -1)
error_die("socket");
//init name value
memset(&name, 0, sizeof(name));
name.sin_family = AF_INET;
name.sin_port = htons(*port);
//INADDR_ANY=0.0.0.0
name.sin_addr.s_addr = htonl(INADDR_ANY);
// Set socket options.
if ((setsockopt(httpd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on))) < 0)
{
error_die("setsockopt failed");
}
// Bind the socket to the address.
if (bind(httpd, (struct sockaddr *)&name, sizeof(name)) < 0)
error_die("bind");
// If port 0 was requested, the kernel picks a free port; getsockname reads back the local port that was actually bound.
if (*port == 0) /* if dynamically allocating a port */
{
socklen_t namelen = sizeof(name);
if (getsockname(httpd, (struct sockaddr *)&name, &namelen) == -1)
error_die("getsockname");
*port = ntohs(name.sin_port);
}
if (listen(httpd, 5) < 0)
error_die("listen");
return(httpd);
}
/**********************************************************************/
/* Inform the client that the requested web method has not been
* implemented.
* Parameter: the client socket */
/**********************************************************************/
void unimplemented(int client)
{
char buf[1024];
sprintf(buf, "HTTP/1.0 501 Method Not Implemented\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, SERVER_STRING);
send(client, buf, strlen(buf), 0);
sprintf(buf, "Content-Type: text/html\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<HTML><HEAD><TITLE>Method Not Implemented\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "</TITLE></HEAD>\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "<BODY><P>HTTP request method not supported.\r\n");
send(client, buf, strlen(buf), 0);
sprintf(buf, "</BODY></HTML>\r\n");
send(client, buf, strlen(buf), 0);
}
/**********************************************************************/
int main(void)
{
int server_sock = -1;
//u_short port = 4000;
u_short port = 9734;
// At first I was puzzled that client_sock starts at -1; it is only a placeholder until accept() in the loop below assigns the real descriptor.
int client_sock = -1;
struct sockaddr_in client_name;
socklen_t client_name_len = sizeof(client_name);
pthread_t newthread;
/* Create the socket, bind, and listen; returns the listening socket descriptor. */
server_sock = startup(&port); // defined above
printf("httpd running on port %d\n", port);
/********************************************/
while (1)
{
client_sock = accept(server_sock,
(struct sockaddr *)&client_name,
&client_name_len);
if (client_sock == -1)
error_die("accept");
/* accept_request(&client_sock); */
if (pthread_create(&newthread , NULL, (void *)accept_request, (void *)(intptr_t)client_sock) != 0)
perror("pthread_create");
}
close(server_sock);
return(0);
}
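The line-ending handling in get_line can be tried without starting the server. The sketch below is my own addition, not part of httpd.c: it repeats the get_line logic so that it compiles alone, and a hypothetical helper named first_line_of uses socketpair() as a stand-in for the client connection.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Same logic as get_line in httpd.c, repeated so the sketch compiles alone. */
static int get_line(int sock, char *buf, int size)
{
    int i = 0;
    char c = '\0';
    int n;

    while ((i < size - 1) && (c != '\n')) {
        n = recv(sock, &c, 1, 0);
        if (n > 0) {
            if (c == '\r') {
                /* Look at the next byte without consuming it. */
                n = recv(sock, &c, 1, MSG_PEEK);
                if ((n > 0) && (c == '\n'))
                    recv(sock, &c, 1, 0);   /* CRLF: consume the LF */
                else
                    c = '\n';               /* bare CR: treat as LF */
            }
            buf[i] = c;
            i++;
        } else {
            c = '\n';                       /* EOF or error ends the line */
        }
    }
    buf[i] = '\0';
    return i;
}

/* Hypothetical helper: push raw bytes into one end of a socket pair and
 * read the first line back from the other end with get_line.
 * Returns the length of the line, or -1 on error. */
int first_line_of(const char *raw, char *buf, int size)
{
    int sv[2];
    int n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (send(sv[0], raw, strlen(raw), 0) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    close(sv[0]);           /* the reader sees EOF after the queued data */
    n = get_line(sv[1], buf, size);
    close(sv[1]);
    return n;
}
```

Passing a request such as "GET / HTTP/1.0\r\nHost: x\r\n" stores the first line with a single '\n' at the end, and a bare '\r' is also turned into '\n'.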
Below is some background on GET, POST, and CGI scripts:
HTTP request: an HTTP request has three parts: the start line, the message headers, and the request body.
Request Line<CRLF>
Header-Name: header-value<CRLF>
Header-Name: header-value<CRLF>
// one or more header lines, each ending with <CRLF>
<CRLF>
body // request body
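1. The start line begins with a method token, followed by the Request-URI and the protocol version, separated by spaces. The format is:
Method Request-URI HTTP-Version CRLF
Method is the request method; Request-URI is a uniform resource identifier; HTTP-Version is the HTTP protocol version of the request; CRLF is a carriage return followed by a line feed (apart from the terminating CRLF, no bare CR or LF character is allowed).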
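As a minimal sketch of the format above, the helper below splits a request line into its three fields with sscanf. The name parse_request_line and the buffer sizes are my own assumptions for illustration; httpd.c itself scans the line by hand in accept_request.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of httpd.c): split
 * "Method SP Request-URI SP HTTP-Version CRLF" into three fields.
 * Returns 0 on success, -1 if the line does not contain three fields. */
int parse_request_line(const char *line,
                       char method[16], char uri[256], char version[16])
{
    /* Each field width is one less than the buffer size, leaving room for
     * the terminating '\0'. %s skips leading whitespace and stops at the
     * next whitespace, so the trailing CRLF is not copied into version. */
    if (sscanf(line, "%15s %255s %15s", method, uri, version) != 3)
        return -1;
    return 0;
}
```

For the line "GET /index.html HTTP/1.1\r\n" this fills method with "GET", uri with "/index.html", and version with "HTTP/1.1".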
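2. There are several request methods (method names are all uppercase):
- GET: retrieve the resource identified by the Request-URI
- POST: submit new data to the resource identified by the Request-URI
- HEAD: retrieve only the response headers for the resource identified by the Request-URI
- PUT: ask the server to store a resource and use the Request-URI as its identifier
- DELETE: ask the server to delete the resource identified by the Request-URI
- TRACE: ask the server to echo back the request it received, mainly for testing or diagnostics
- CONNECT: ask a proxy to open a tunnel to the target (older descriptions list it as reserved for future use)
- OPTIONS: query the server's capabilities, or the options and requirements associated with a resource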
In the program, void accept_request(void *arg) has a method[255] array that stores the request method. The source handles only the GET and POST methods.
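The following runs the code to make the process concrete. (To run the program on Linux, pay attention to the pthread link option in the Makefile. To execute the CGI scripts, check where the script interpreter lives with which perl. You may still hit problems at this point, but don't get frustrated; search for the error online.)
Run ./httpd, then enter 127.0.0.1:9734 in the browser. 9734 is the port I chose; use whatever your own program defines. (I opened the browser on Windows, because I use WSL2 + Ubuntu and write the code in VS Code.) Enter red and click submit to see the result.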
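Entering 127.0.0.1:9734 and pressing Enter sends a GET request; the server calls serve_file(client, path) and returns the HTML page. The printed values are shown in the figure.
Printing the buf array in headers()
gives the following.
cat() in the source sends the contents of index.html. I did not print what cat sends; print it yourself if you need it.
This shows the format of the message the server sends back in response to a GET request.
Figure from the internet.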
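To make the response layout concrete, here is a small sketch that composes a whole response in one buffer: status line, headers, a blank line, then the body. The function name build_response is hypothetical, and the Content-Length header is my own addition; headers() in httpd.c does not send it.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of httpd.c): write a complete HTTP/1.0
 * response for an HTML body into out. Returns the number of bytes written
 * (excluding the terminating '\0'), or -1 if the buffer is too small. */
int build_response(char *out, size_t cap, const char *body)
{
    int n = snprintf(out, cap,
                     "HTTP/1.0 200 OK\r\n"          /* status line */
                     "Server: jdbhttpd/0.1.0\r\n"   /* same text as SERVER_STRING */
                     "Content-Type: text/html\r\n"
                     "Content-Length: %zu\r\n"      /* assumption: not sent by headers() */
                     "\r\n"                         /* blank line ends the headers */
                     "%s",
                     strlen(body), body);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

The returned length can be passed straight to send(), which avoids the sizeof(buf)-versus-strlen(buf) confusion that is easy to run into when each line is sent separately.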
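At this point I think I have a basic understanding of GET requests.
One surprise: after the GET I unexpectedly got the result below (still to be investigated; I am not sure what causes it).
Entering red and clicking submit sends a POST request; the output is shown in the figure.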
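I kept wondering how color.cgi was sent to the server side, and later found the answer in index.html.
One last question remains: what the pipes do in the program.
The child process calls execl() on path, which is what executes the CGI script.
What is a CGI script?
After reading that, my understanding is that the CGI script (the child process) reads data from the parent through cgi_input[0], which is its STDIN (the parent writes to cgi_input[1]), and sends data back to the parent through cgi_output[1], which is its STDOUT (the parent reads from cgi_output[0]). But I still cannot explain the role of the two environment variables very clearly.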
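The two-pipe arrangement can be sketched in isolation. In the example below the child does not exec a script; a small function that upper-cases its input stands in for the CGI program, so the sketch has no external dependencies. The names run_through_child, to_child, and from_child are mine, and the helper is only meant for small inputs, because the parent writes everything before it starts reading.

```c
#include <ctype.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for a CGI script: copy stdin to stdout in upper case. */
static void child_filter(void)
{
    char c;
    while (read(STDIN_FILENO, &c, 1) > 0) {
        c = (char)toupper((unsigned char)c);
        if (write(STDOUT_FILENO, &c, 1) != 1)
            break;
    }
}

/* Hypothetical helper: send input to a child over one pipe and collect the
 * child's output from the other. Returns the number of bytes stored in out
 * (which is NUL-terminated), or -1 on error. */
ssize_t run_through_child(const char *input, char *out, size_t cap)
{
    int to_child[2];     /* plays the role of cgi_input  */
    int from_child[2];   /* plays the role of cgi_output */
    pid_t pid;
    size_t len = 0;
    ssize_t n;
    int ok = 1;

    if (cap == 0 || pipe(to_child) < 0)
        return -1;
    if (pipe(from_child) < 0) {
        close(to_child[0]); close(to_child[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: the pipe ends become stdin and stdout. */
        dup2(to_child[0], STDIN_FILENO);
        dup2(from_child[1], STDOUT_FILENO);
        close(to_child[0]); close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        child_filter();
        _exit(0);
    }
    /* Parent: keep only the write end of one pipe and the read end of the other. */
    close(to_child[0]);
    close(from_child[1]);
    if (write(to_child[1], input, strlen(input)) < 0)
        ok = 0;
    close(to_child[1]);  /* the child sees EOF on its stdin */
    while (len < cap - 1 &&
           (n = read(from_child[0], out + len, cap - 1 - len)) > 0)
        len += (size_t)n;
    out[len] = '\0';
    close(from_child[0]);
    waitpid(pid, NULL, 0);
    return ok ? (ssize_t)len : -1;
}
```

Closing the unused pipe ends matters: the child only sees end-of-file on its stdin once every write end of that pipe is closed, and the parent's read loop only finishes once the child's stdout is closed.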
Answering the environment-variable question probably requires learning CGI scripting. If I get the chance later, I will come back to it.
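As a starting point, here is how a CGI program written in C might use those variables. Under the CGI convention, REQUEST_METHOD tells the program which method was used; for GET the data arrives in QUERY_STRING, and for POST the program reads CONTENT_LENGTH bytes of body from its standard input. The helper name cgi_read_input is hypothetical, and taking the input stream as a parameter is my own choice so the function can be tried without a server.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper showing how a CGI program might collect its input.
 * in is the stream to read a POST body from (stdin in a real CGI program).
 * Returns 0 on success, -1 on a missing variable or a too-small buffer. */
int cgi_read_input(FILE *in, char *out, size_t cap)
{
    const char *method = getenv("REQUEST_METHOD");

    if (method == NULL || cap == 0)
        return -1;

    if (strcmp(method, "GET") == 0) {
        /* GET: the data is the part of the URL after '?'. */
        const char *qs = getenv("QUERY_STRING");
        if (qs == NULL || strlen(qs) >= cap)
            return -1;
        strcpy(out, qs);
        return 0;
    }
    if (strcmp(method, "POST") == 0) {
        /* POST: CONTENT_LENGTH says how many body bytes to read. */
        const char *cl = getenv("CONTENT_LENGTH");
        long len = (cl != NULL) ? strtol(cl, NULL, 10) : -1;
        if (len < 0 || (size_t)len >= cap)
            return -1;
        if (fread(out, 1, (size_t)len, in) != (size_t)len)
            return -1;
        out[len] = '\0';
        return 0;
    }
    return -1;   /* other methods are not handled in this sketch */
}
```

This matches what execute_cgi prepares on the server side: it sets REQUEST_METHOD for every request, QUERY_STRING for GET, and CONTENT_LENGTH for POST, and for POST it writes the body into the pipe that becomes the script's standard input.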
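If you have questions, think I got something wrong, or know the answer to one of the open questions in this post, feel free to contact me at 909244296@qq.com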
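References:
Tinyhttpd精读解析 (a close reading of Tinyhttpd), nengm, 博客园 (cnblogs.com)
Tinyhttpd项目解析 (Tinyhttpd project analysis), changfei_1995, CSDN
HTTP服务器的本质:tinyhttpd源码分析及拓展 (the essence of an HTTP server: tinyhttpd source analysis and extensions), IT 哈, CSDN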