Class type Pxp_lexer_types.lexer_obj


class type lexer_obj = object .. end
A lexer_obj scans from a certain lexer_source. The lexbuf is an internal value of the lexer_obj.

method factory : lexer_factory
The lexer_factory that created this lexer_obj
method encoding : Pxp_core_types.rep_encoding
The character encoding of the scanned strings
method open_source : Pxp_reader.lexer_source -> unit
Drop the current source, and open another source
method open_string : string -> unit
Drop the current source, and open the string as next source
method open_string_inplace : string -> unit
Drop the current source, and open the string as next source. The string is physically used as lexical buffer (no copy is made)
method scan_document : unit -> token * lexers
method scan_content : unit -> token * lexers
method scan_within_tag : unit -> token * lexers
method scan_document_type : unit -> token * lexers
method scan_declaration : unit -> token * lexers
method scan_comment : unit -> lexers -> token * lexers
method scan_ignored_section : unit -> token * lexers
method detect_xml_pi : unit -> bool
method scan_xml_pi : unit -> prolog_token
method scan_pi_string : unit -> string option
method scan_dtd_string : unit -> token
method scan_content_string : unit -> token
method scan_name_string : unit -> token
method scan_for_crlf : unit -> token
method scan_characters : unit -> unit
method scan_character : unit -> unit
method scan_tag_eb : unit -> token * lexers
method scan_tag_eb_att : unit -> bool -> token * lexers
method lexeme_length : int
The length of the lexeme in characters

For some implementations, this function is very inefficient.

method lexeme_char : int -> int
Returns one character of the lexeme as Unicode code point

For some implementations, this function is very inefficient.
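
A minimal sketch of why character-indexed access can be costly, assuming the lexeme is stored as UTF-8 bytes (one possible value of encoding). The helpers `width`, `decode` and `char_at` are hypothetical names for illustration and are not part of PXP; they assume well-formed input.

```ocaml
(* Byte width of the UTF-8 sequence whose lead byte is [c]. *)
let width (c : char) : int =
  let b = Char.code c in
  if b < 0x80 then 1 else if b < 0xE0 then 2 else if b < 0xF0 then 3 else 4

(* Decode the code point that starts at byte offset [i] of [s]. *)
let decode (s : string) (i : int) : int =
  let cont k = Char.code s.[i + k] land 0x3F in
  let b0 = Char.code s.[i] in
  match width s.[i] with
  | 1 -> b0
  | 2 -> ((b0 land 0x1F) lsl 6) lor cont 1
  | 3 -> ((b0 land 0x0F) lsl 12) lor (cont 1 lsl 6) lor cont 2
  | _ -> ((b0 land 0x07) lsl 18) lor (cont 1 lsl 12) lor (cont 2 lsl 6) lor cont 3

(* Code point of the [n]-th character (0-based). Reaching it requires
   walking over the n preceding characters, hence linear cost. *)
let char_at (s : string) (n : int) : int =
  let rec skip i k = if k = 0 then i else skip (i + width s.[i]) (k - 1) in
  decode s (skip 0 n)

(* "a", U+00E9 and U+20AC encoded as UTF-8: 1 + 2 + 3 bytes. *)
let sample = "a\xC3\xA9\xE2\x82\xAC"
let third = char_at sample 2
```

With a variable-width encoding, the byte offset of character n is only known after scanning the characters before it, which is consistent with the efficiency warning above.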

method lexeme : string
The lexeme scanned last, encoded according to encoding
method lexeme_strlen : int
Equal to String.length lexeme, i.e. the number of bytes of the lexeme, not the number of characters
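
The difference between the byte count and the character count can be illustrated without PXP. The sketch below assumes a UTF-8 encoded lexeme; `utf8_length` is a hypothetical helper, not a PXP function, and stands in for what lexeme_length would report under that assumption.

```ocaml
(* Count code points in a UTF-8 string by counting every byte that is
   not a continuation byte (continuation bytes have the form 10xxxxxx). *)
let utf8_length (s : string) : int =
  let n = ref 0 in
  String.iter (fun c -> if Char.code c land 0xC0 <> 0x80 then incr n) s;
  !n

(* "caf" followed by U+00E9, which takes two bytes in UTF-8. *)
let lexeme = "caf\xC3\xA9"

(* Number of bytes, corresponding to lexeme_strlen. *)
let strlen = String.length lexeme

(* Number of characters, corresponding to lexeme_length (assuming UTF-8). *)
let length = utf8_length lexeme
```

For a single-byte encoding such as ISO-8859-1 the two numbers coincide.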
method sub_lexeme : int -> int -> string
A substring of the current lexeme. The arguments are the position and length of the substring in characters (not bytes). The string is encoded in encoding.

For some implementations, this function is very inefficient.
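
As a rough sketch of character-based (rather than byte-based) substring extraction, again assuming UTF-8 and well-formed input: `byte_offset` and `sub_chars` are hypothetical helpers written for this example and do not reflect how any PXP lexer is actually implemented.

```ocaml
(* Byte width of the UTF-8 sequence whose lead byte is [c]. *)
let width (c : char) : int =
  let b = Char.code c in
  if b < 0x80 then 1 else if b < 0xE0 then 2 else if b < 0xF0 then 3 else 4

(* Starting at byte offset [start], advance over [n] characters and
   return the resulting byte offset. *)
let rec byte_offset (s : string) (start : int) (n : int) : int =
  if n = 0 then start else byte_offset s (start + width s.[start]) (n - 1)

(* Substring of [len] characters beginning at character position [pos];
   both arguments count characters, as for sub_lexeme. *)
let sub_chars (s : string) (pos : int) (len : int) : string =
  let first = byte_offset s 0 pos in
  let last = byte_offset s first len in
  String.sub s first (last - first)

(* "a", U+00E9, U+20AC, "b": 4 characters, 7 bytes. *)
let sample = "a\xC3\xA9\xE2\x82\xAC" ^ "b"
let middle = sub_chars sample 1 2
```

Here the two requested characters occupy five bytes, so character positions have to be translated into byte offsets before String.sub can be applied.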