- Add an http_parser_parse_url() method to parse a URL into its
constituent components. This uses the same underlying parser
as http_parser_parse() and doesn't do any data copies.
- Re-add the URL components in various test.c structures; validate
them when parsing.
- Treat ' ' specially, as apparently IIS6.0 can send this in headers.
Allow this character through if we're not in strict mode.
- Move some test code around so that test indices don't break when
HTTP_PARSER_STRICT changes.
Fixes #13.
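A minimal sketch of how the new http_parser_parse_url() entry point might be used, assuming the struct http_parser_url / UF_* names from the current header (member names are illustrative of that header, not guaranteed by this change):

    /* Extract the path component of a URL without copying. */
    #include <stdio.h>
    #include <string.h>
    #include "http_parser.h"

    int main(void) {
      const char *url = "http://example.com:8080/foo/bar?baz=1";
      struct http_parser_url u;

      if (http_parser_parse_url(url, strlen(url), 0 /* is_connect */, &u) != 0) {
        fprintf(stderr, "parse failed\n");
        return 1;
      }

      /* Each component is reported as an (offset, length) pair into the
       * original buffer, so the parser itself copies no data. */
      if (u.field_set & (1 << UF_PATH)) {
        printf("path = %.*s\n",
               (int)u.field_data[UF_PATH].len,
               url + u.field_data[UF_PATH].off);
      }
      return 0;
    }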
With HTTP/1.1, if neither Content-Length nor Transfer-Encoding is present,
section 4.4 of RFC 2616 says the response body extends until the connection
is closed, so http-parser needs to read it that way (unless it is a response
that must not include a body).
See also joyent/node#2457.
Fixes #72.
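A sketch of how a caller sees this, assuming the callback-style API from http_parser.h: body chunks keep arriving as data is read, and the caller reports EOF by executing the parser with a zero-length buffer, which triggers on_message_complete.

    #include <stdio.h>
    #include <string.h>
    #include "http_parser.h"

    static int on_body(http_parser *p, const char *at, size_t len) {
      (void)p;
      printf("body chunk: %.*s\n", (int)len, at);
      return 0;
    }

    static int on_complete(http_parser *p) {
      (void)p;
      printf("message complete\n");
      return 0;
    }

    int main(void) {
      /* No Content-Length, no Transfer-Encoding: body runs until close. */
      const char *resp = "HTTP/1.1 200 OK\r\n\r\nhello";
      http_parser parser;
      http_parser_settings settings;

      memset(&settings, 0, sizeof(settings));
      settings.on_body = on_body;
      settings.on_message_complete = on_complete;

      http_parser_init(&parser, HTTP_RESPONSE);
      http_parser_execute(&parser, &settings, resp, strlen(resp));

      /* Signal EOF with a zero-length execute so the parser can finish
       * the close-delimited body. */
      http_parser_execute(&parser, &settings, NULL, 0);
      return 0;
    }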
- Get rid of support for these callbacks in http_parser_settings.
- Retain state transitions between different URL portions in
http_parser_execute() so that we're making the same correctness
guarantees as before.
- These are being removed because making multiple callbacks for the same
byte makes it more difficult to pause the parser.
- These were only used by CALLBACK() (which then cleared the mark anyway)
and at the end of the http_parser_execute() body (after which they go out
of scope).
- Add http_errno enum w/ values for many parsing error conditions. The
error value is stashed in http_parser.state, flagged by setting the 0x80 bit.
- Report line numbers on error generation if the (new) HTTP_PARSER_DEBUG
cpp symbol is set. Increases http_parser struct size by 8 bytes in
this case.
- Add http_errno_*() methods to help turn errno values into
human-readable messages (see the sketch below).
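A sketch of the error-reporting path, assuming the HTTP_PARSER_ERRNO() accessor and the http_errno_name()/http_errno_description() helpers as they appear in the current http_parser.h:

    #include <stdio.h>
    #include <string.h>
    #include "http_parser.h"

    int main(void) {
      const char *junk = "GET / HTP/1.1\r\n\r\n";   /* malformed version */
      http_parser parser;
      http_parser_settings settings;
      size_t nparsed;

      memset(&settings, 0, sizeof(settings));
      http_parser_init(&parser, HTTP_REQUEST);

      nparsed = http_parser_execute(&parser, &settings, junk, strlen(junk));
      if (nparsed != strlen(junk)) {
        /* Parsing stopped early: ask the parser what went wrong. */
        enum http_errno err = HTTP_PARSER_ERRNO(&parser);
        fprintf(stderr, "error %s: %s\n",
                http_errno_name(err), http_errno_description(err));
      }
      return 0;
    }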
- When handling upgraded bodies, http_parser_execute() used to return
one fewer byte parsed than expected. This caused the final LF to be
interpreted by the caller as part of the body.
- Add a bunch of upgrade body unit tests.
The normal value callback is called for subsequent lines; leading LWS is
skipped (see the sketch below, after the RFC quote).
Note that the \t whitespace character is now accepted after the header field name.
RFC 2616, Section 2.2
"HTTP/1.1 header field values can be folded onto multiple lines if the
continuation line begins with a space or horizontal tab. All linear
white space, including folding, has the same semantics as SP. A
recipient MAY replace any linear white space with a single SP before
interpreting the field value or forwarding the message downstream."
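A sketch of how a folded value arrives at the application, assuming the callback-style API; the exact chunk boundaries are up to the parser, so the callback just reports whatever spans it is handed:

    #include <stdio.h>
    #include <string.h>
    #include "http_parser.h"

    static int on_header_value(http_parser *p, const char *at, size_t len) {
      (void)p;
      /* Per this change, the value callback fires once per line, with the
       * continuation line's leading LWS skipped; a caller wanting RFC 2616
       * semantics can join the chunks with a single SP. */
      printf("value chunk: \"%.*s\"\n", (int)len, at);
      return 0;
    }

    int main(void) {
      const char *req =
          "GET / HTTP/1.1\r\n"
          "X-Folded: Value1\r\n"
          "\tValue2\r\n"
          "\r\n";
      http_parser parser;
      http_parser_settings settings;

      memset(&settings, 0, sizeof(settings));
      settings.on_header_value = on_header_value;

      http_parser_init(&parser, HTTP_REQUEST);
      http_parser_execute(&parser, &settings, req, strlen(req));
      return 0;
    }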
- Add IS_ALPHA(), IS_NUM(), IS_HOST_CHAR(), etc. macros for determining
membership in a character class. HTTP_PARSER_STRICT causes some of
these definitions to change.
- Support '_' character in hostnames in non-strict mode.
- Support leading digits in hostnames when the method is HTTP_CONNECT.
- Don't re-define HTTP_PARSER_STRICT in http_parser.h if it's already
defined.
- Tweak Makefile to run non-strict-mode unit tests. Rearrange non-strict
mode unit tests in test.c.
- Add test_fast to .gitignore.
Fixes #44.
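A sketch of what such character-class macros can look like; the actual definitions in http_parser.c differ, and some of them change when HTTP_PARSER_STRICT is defined:

    #include <stdio.h>

    #define IS_NUM(c)      ((c) >= '0' && (c) <= '9')
    #define IS_ALPHA(c)    (((c) >= 'a' && (c) <= 'z') || ((c) >= 'A' && (c) <= 'Z'))
    #define IS_ALPHANUM(c) (IS_ALPHA(c) || IS_NUM(c))

    #if HTTP_PARSER_STRICT
    # define IS_HOST_CHAR(c) (IS_ALPHANUM(c) || (c) == '.' || (c) == '-')
    #else
      /* Non-strict mode additionally admits '_' in hostnames. */
    # define IS_HOST_CHAR(c) (IS_ALPHANUM(c) || (c) == '.' || (c) == '-' || (c) == '_')
    #endif

    int main(void) {
      printf("IS_HOST_CHAR('_') = %d\n", IS_HOST_CHAR('_'));
      return 0;
    }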
- This is non-spec behavior, but it appears that most HTTP servers
implicitly support non-ASCII characters when parsing path components.
Extend http-parser to allow this.
- Fill out slots [128, 256) in normal_url_char[] with 1 so that these
high octets are accepted in path components.
- Add unit test for paths that include such non-ASCII characters.
Fixes #37.
Check for overflow while parsing the chunk trailer by removing an unnecessary check from the PARSING_HEADER macro. This forces the parser to abort if the chunk trailer contains more than HTTP_MAX_HEADER_SIZE bytes of data.
Without this change, it is possible to trigger an assertion failure by
continuing to call http_parser_execute after it has returned an error.
Specifically, the parser could be called with parser->state ==
s_chunk_size_almost_done and parser->flags & F_CHUNKED set. Then,
F_CHUNKED could have been cleared, and an error could be hit. In this
case, the parser would have returned with F_CHUNKED clear, but
parser->state == s_chunk_size_almost_done, resulting in an assertion
failure on the next call.
There are alternate solutions possible, including just saving all of
the fields (state included) on error.
I didn't add a test case because this is a bit annoying to test, but I
can add one if necessary.
acceptable_header[x] is always assigned to a variable of type char, so
the 'unsigned' is unnecessary.
The other arrays can be of type int8_t/uint8_t to save space.
Yay valgrind testing
I don't believe that this actually mattered at all, because state was
initialized correctly, and flags would be set to 0 almost immediately
anyway.
This matters because char is signed by default on x86, so bytes with
values above 127 could have theoretically survived a pass through
lowcase (assuming that there was some non-zero data before the lowcase
array).
This also fixes test failures from the previous commit.
It also adds support for the LOCK method, which was previously
missing.
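A small illustration of the signed-char pitfall described above; the table here is a stand-in, not the parser's actual lowcase[] contents:

    #include <stdio.h>

    int main(void) {
      static char lowcase[256];
      int i;
      char ch = (char)0xE9;             /* a high octet, e.g. from a UTF-8 path */

      for (i = 'A'; i <= 'Z'; i++)      /* stand-in table: map A-Z to a-z */
        lowcase[i] = (char)(i - 'A' + 'a');

      /* BUG: if char is signed (as on x86), ch is negative here and this
       * indexes before the start of the array:
       *   char bad = lowcase[ch];
       * FIX: force the index into [0, 255] first. */
      char good = lowcase[(unsigned char)ch];

      printf("mapped to %d\n", good);   /* 0: high octets are rejected */
      return 0;
    }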
This brings the size of http_parser from 44 bytes to 32 bytes. It
also makes the code substantially shorter, at a slight cost in
craziness.
This saves space in the structure (it is now 28 bytes on x86), and
makes the handling of content_length more consistent between chunked
encoding and non-chunked-encoding.
This fixes a possible issue where a very large body (one that involves
> 80*1024 calls to http_parser_execute) will cause the next request
with that parser to return an error because it believes that this is
an overflow condition.
The *_mark members were really only being used as boolean values carried
over to the next call of the parser. However, whether the marks should be
set can be determined purely from the current state, so they can be removed
entirely.
This does have some slight functional changes in cases where
MAX_FIELD_SIZE is hit. Specifically, if a URL is made up of many
components, each of which is smaller than MAX_FIELD_SIZE but whose
total together is greater than MAX_FIELD_SIZE, then we now might not
call callbacks for any of the components (even the ones that are
smaller than 80kb). With the old code, it was possible to get a
callback for query_string and never get a callback for the URL (or at
least the end of the URL that is past 80kb) if the callback for the
URL would have been larger than 80kb.
(to be honest, I'm surprised that the MAX_FIELD_SIZE is implemented in
http_parser at all, instead of requiring that callers pay attention to
it, as it feels like it should be the caller's responsibility)
That is, for a request parser do this:
http_parser_init(my_parser, HTTP_REQUEST)
for a response parser do this:
http_parser_init(my_parser, HTTP_RESPONSE)
Then http_parse_requests() and http_parse_responses() both turn
into http_parser_execute().
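A sketch of the unified entry point, written against the current header (which also takes a separate http_parser_settings struct); the exact signature has shifted over time, so this is illustrative of the init/execute split only:

    #include <string.h>
    #include "http_parser.h"

    int main(void) {
      const char *req = "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n";
      http_parser parser;
      http_parser_settings settings;

      memset(&settings, 0, sizeof(settings));

      /* One init call selects the parser type... */
      http_parser_init(&parser, HTTP_REQUEST);    /* or HTTP_RESPONSE */
      /* ...and a single execute function handles either kind of message. */
      http_parser_execute(&parser, &settings, req, strlen(req));
      return 0;
    }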
This
- sacrifices a little space (10 bytes),
- adds a few extra calculations, and
- introduces a dependency on strncmp()
to dramatically simplify the code for parsing methods and to support almost
arbitrary extension methods.
In the future I will do as NGINX does and use bit-level blob comparisons
instead of strncmp().
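A sketch of the strncmp() approach described above (the method list and helper name are illustrative, not the parser's code): branch on the first byte, then compare the rest against the small set of known methods.

    #include <stdio.h>
    #include <string.h>

    static const char *match_method(const char *buf, size_t len) {
      switch (buf[0]) {
        case 'G':
          if (len >= 3 && strncmp(buf, "GET", 3) == 0) return "GET";
          break;
        case 'P':
          if (len >= 4 && strncmp(buf, "POST", 4) == 0) return "POST";
          if (len >= 3 && strncmp(buf, "PUT", 3) == 0) return "PUT";
          break;
        case 'D':
          if (len >= 6 && strncmp(buf, "DELETE", 6) == 0) return "DELETE";
          break;
        default:
          break;
      }
      return NULL;  /* unknown / extension method */
    }

    int main(void) {
      printf("%s\n", match_method("POST /x HTTP/1.1", 16));
      return 0;
    }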
Trashing the old Ragel parser (which was based on Mongrel) because it's
proving difficult to get the control I need in end-of-message cases.
Replacing this with a hand-written parser using a couple of tricks borrowed
from NGINX. The new parser will be much more work to write, but should prove
faster and allow for better hacking.