
Parsing JavaScript Code with Narcissus


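    I have recently been working on a JavaScript experiment that needs to parse JavaScript code into a syntax tree on the client side. In other words, I need a JavaScript parser implemented in JavaScript. There are plenty of options: common tools such as yacc, lex, and bison all have JavaScript versions, and ANTLR can also generate JavaScript as its target. I did not want to spend much time on this, though, so I looked for a ready-made tool and eventually settled on Narcissus.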

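    Narcissus is a JavaScript engine written entirely in JavaScript. It relies on some SpiderMonkey extensions, however, so it cannot run directly on engines that implement only ECMAScript 3 (such as those in the various browsers). According to its Wikipedia page, Narcissus was developed by Brendan Eich, the author of SpiderMonkey, and is named after the figure in Greek mythology who fell in love with his own reflection, which fits the idea of "a JavaScript engine written in JavaScript" (very cultured indeed). There is also a Firefox add-on, Zaphod, that replaces the browser's JavaScript engine with Narcissus.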

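    Narcissus is a very simple JavaScript engine that can be used to explore new JavaScript language features. It performs almost no optimization, so it cannot compete with other engines on performance, but it clearly contains a complete JavaScript parser, which is exactly what I need. First, download its source code from GitHub. It consists of six files, of which I only need three: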

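  • jsdefs.js: contains the Narcissus.definitions component, with the various token definitions and so on.
  • jslex.js: contains the Narcissus.lexer component, the tokenizer.
  • jsparse.js: contains Narcissus.parser, the parser.

    As mentioned earlier, Narcissus cannot run directly in a browser, so we also have to modify it. First, in jsdefs.js, take the definition at the top of the file, which uses Object.create: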

    (function() {
        var builderTypes = Object.create(null, {
            ...
        });
    
        ...
    
        var narcissus = {
            ...
        };
    
        Narcissus = narcissus;
    })();

        and replace it with a plain declaration:

    var Narcissus = { };

        Next, still in jsdefs.js, we need to change the implementations of defineProperty and defineGetter:

    function defineGetter(obj, prop, fn, dontDelete, dontEnum) {
        Object.defineProperty(...);
    }
    
    function defineProperty(obj, prop, val, dontDelete, readOnly, dontEnum) {
        Object.defineProperty(...);
    }

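        Both functions rely on Object.defineProperty, which engines implementing only ECMAScript 3 do not provide (it was standardized in ECMAScript 5), so we change them to "direct assignment" versions: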

    function defineGetter(obj, prop, fn, dontDelete, dontEnum) {
        obj[prop] = fn;
    }
    
    function defineProperty(obj, prop, val, dontDelete, readOnly, dontEnum) {
        obj[prop] = val;
    }

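        Of course, this is not equivalent to the original behavior: the dontDelete, readOnly, and dontEnum flags are ignored, and a getter becomes a plain function-valued property. It does not affect how the code is used here, though. You can find the places that use these two functions in jsparse.js.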
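
        To make the difference concrete, here is a minimal sketch comparing the two approaches. The helper names (defineGetterByAssignment, defineGetterES5) and the sample objects are made up for illustration and are not taken from Narcissus; the getter-based version assumes an engine that supports ECMAScript 5.

```javascript
// Direct-assignment stand-in, mirroring the modified jsdefs.js shown above.
function defineGetterByAssignment(obj, prop, fn) {
    obj[prop] = fn;
}

// Getter-based version for comparison (hypothetical helper; needs ES5).
function defineGetterES5(obj, prop, fn) {
    Object.defineProperty(obj, prop, {
        get: fn,
        configurable: true,
        enumerable: true
    });
}

var a = {};
var b = {};
function answer() { return 42; }

defineGetterByAssignment(a, "value", answer);
defineGetterES5(b, "value", answer);

var assignedType = typeof a.value; // the function itself is stored
var getterType = typeof b.value;   // the getter runs on property access
var assignedCall = a.value();      // callers must invoke it explicitly
var getterValue = b.value;         // plain property access is enough
```

        With direct assignment, any code that reads the property expecting a value gets a function back instead, so the replacement is only safe where the callers do not depend on getter semantics.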

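        Now you can include these three JavaScript files in a page and use Narcissus.parser to parse JavaScript code. Narcissus has almost no documentation, but it is not hard to work out how to use it from the code. For example: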

    function parseSelf() { 
        var builder = new Narcissus.parser.DefaultBuilder();
        return Narcissus.parser.parse(builder, parseSelf.toString(), "temp", 1);
    }
    
    // The wrapper tags were lost from the original listing; <pre> is assumed here.
    document.write("<pre>" + parseSelf() + "</pre>");

        Output:

    {
        type: SCRIPT,
        children: {
            type: FUNCTION,
            body: {
                type: SCRIPT,
                children: {
                    type: VAR,
                    children: {
                        type: IDENTIFIER,
                        children: ,
                        end: 38,
                        initializer: {
                            type: NEW,
                            children: {
                                type: DOT,
                                children: {
                                    type: DOT,
                                    children: {
                                        type: IDENTIFIER,
                                        children: ,
                                        end: 55,
                                        lineno: 2,
                                        start: 46,
                                        tokenizer: [object Object],
                                        value: Narcissus
                                    },{
                                        type: IDENTIFIER,
                                        children: ,
                                        end: 62,
                                        lineno: 2,
                                        start: 56,
                                        tokenizer: [object Object],
                                        value: parser
                                    },
                                    end: 62,
                                    lineno: 2,
                                    start: 46,
                                    tokenizer: [object Object],
                                    value: .
                                },{
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 77,
                                    lineno: 2,
                                    start: 63,
                                    tokenizer: [object Object],
                                    value: DefaultBuilder
                                },
                                end: 77,
                                lineno: 2,
                                parenthesized: true,
                                start: 46,
                                tokenizer: [object Object],
                                value: .
                            },
                            end: 77,
                            lineno: 2,
                            start: 41,
                            tokenizer: [object Object],
                            value: new
                        },
                        lineno: 2,
                        name: builder,
                        readOnly: false,
                        start: 31,
                        tokenizer: [object Object],
                        value: builder
                    },
                    destructurings: ,
                    end: 38,
                    lineno: 2,
                    start: 27,
                    tokenizer: [object Object],
                    value: var
                },{
                    type: RETURN,
                    children: ,
                    end: 90,
                    lineno: 3,
                    start: 84,
                    tokenizer: [object Object],
                    value: {
                        type: CALL,
                        children: {
                            type: DOT,
                            children: {
                                type: DOT,
                                children: {
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 100,
                                    lineno: 3,
                                    start: 91,
                                    tokenizer: [object Object],
                                    value: Narcissus
                                },{
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 107,
                                    lineno: 3,
                                    start: 101,
                                    tokenizer: [object Object],
                                    value: parser
                                },
                                end: 107,
                                lineno: 3,
                                start: 91,
                                tokenizer: [object Object],
                                value: .
                            },{
                                type: IDENTIFIER,
                                children: ,
                                end: 113,
                                lineno: 3,
                                start: 108,
                                tokenizer: [object Object],
                                value: parse
                            },
                            end: 113,
                            lineno: 3,
                            start: 91,
                            tokenizer: [object Object],
                            value: .
                        },{
                            type: LIST,
                            children: {
                                type: IDENTIFIER,
                                children: ,
                                end: 121,
                                lineno: 3,
                                start: 114,
                                tokenizer: [object Object],
                                value: builder
                            },{
                                type: CALL,
                                children: {
                                    type: DOT,
                                    children: {
                                        type: IDENTIFIER,
                                        children: ,
                                        end: 132,
                                        lineno: 3,
                                        start: 123,
                                        tokenizer: [object Object],
                                        value: parseSelf
                                    },{
                                        type: IDENTIFIER,
                                        children: ,
                                        end: 141,
                                        lineno: 3,
                                        start: 133,
                                        tokenizer: [object Object],
                                        value: toString
                                    },
                                    end: 141,
                                    lineno: 3,
                                    start: 123,
                                    tokenizer: [object Object],
                                    value: .
                                },{
                                    type: LIST,
                                    children: ,
                                    end: 142,
                                    lineno: 3,
                                    start: 141,
                                    tokenizer: [object Object],
                                    value: (
                                },
                                end: 142,
                                lineno: 3,
                                start: 123,
                                tokenizer: [object Object],
                                value: (
                            },{
                                type: STRING,
                                children: ,
                                end: 151,
                                lineno: 3,
                                start: 145,
                                tokenizer: [object Object],
                                value: temp
                            },{
                                type: NUMBER,
                                children: ,
                                end: 154,
                                lineno: 3,
                                start: 153,
                                tokenizer: [object Object],
                                value: 1
                            },
                            end: 154,
                            lineno: 3,
                            start: 113,
                            tokenizer: [object Object],
                            value: (
                        },
                        end: 154,
                        lineno: 3,
                        start: 91,
                        tokenizer: [object Object],
                        value: (
                    }
                },
                end: 90,
                funDecls: ,
                id: 0,
                lineno: 1,
                start: 21,
                tokenizer: [object Object],
                value: {,
                varDecls: {
                    type: IDENTIFIER,
                    children: ,
                    end: 38,
                    initializer: {
                        type: NEW,
                        children: {
                            type: DOT,
                            children: {
                                type: DOT,
                                children: {
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 55,
                                    lineno: 2,
                                    start: 46,
                                    tokenizer: [object Object],
                                    value: Narcissus
                                },{
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 62,
                                    lineno: 2,
                                    start: 56,
                                    tokenizer: [object Object],
                                    value: parser
                                },
                                end: 62,
                                lineno: 2,
                                start: 46,
                                tokenizer: [object Object],
                                value: .
                            },{
                                type: IDENTIFIER,
                                children: ,
                                end: 77,
                                lineno: 2,
                                start: 63,
                                tokenizer: [object Object],
                                value: DefaultBuilder
                            },
                            end: 77,
                            lineno: 2,
                            parenthesized: true,
                            start: 46,
                            tokenizer: [object Object],
                            value: .
                        },
                        end: 77,
                        lineno: 2,
                        start: 41,
                        tokenizer: [object Object],
                        value: new
                    },
                    lineno: 2,
                    name: builder,
                    readOnly: false,
                    start: 31,
                    tokenizer: [object Object],
                    value: builder
                }
            },
            children: ,
            end: 158,
            functionForm: 0,
            lineno: 1,
            name: parseSelf,
            params: ,
            start: 0,
            tokenizer: [object Object],
            value: function
        },
        funDecls: {
            type: FUNCTION,
            body: {
                type: SCRIPT,
                children: {
                    type: VAR,
                    children: {
                        type: IDENTIFIER,
                        children: ,
                        end: 38,
                        initializer: {
                            type: NEW,
                            children: {
                                type: DOT,
                                children: {
                                    type: DOT,
                                    children: {
                                        type: IDENTIFIER,
                                        children: ,
                                        end: 55,
                                        lineno: 2,
                                        start: 46,
                                        tokenizer: [object Object],
                                        value: Narcissus
                                    },{
                                        type: IDENTIFIER,
                                        children: ,
                                        end: 62,
                                        lineno: 2,
                                        start: 56,
                                        tokenizer: [object Object],
                                        value: parser
                                    },
                                    end: 62,
                                    lineno: 2,
                                    start: 46,
                                    tokenizer: [object Object],
                                    value: .
                                },{
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 77,
                                    lineno: 2,
                                    start: 63,
                                    tokenizer: [object Object],
                                    value: DefaultBuilder
                                },
                                end: 77,
                                lineno: 2,
                                parenthesized: true,
                                start: 46,
                                tokenizer: [object Object],
                                value: .
                            },
                            end: 77,
                            lineno: 2,
                            start: 41,
                            tokenizer: [object Object],
                            value: new
                        },
                        lineno: 2,
                        name: builder,
                        readOnly: false,
                        start: 31,
                        tokenizer: [object Object],
                        value: builder
                    },
                    destructurings: ,
                    end: 38,
                    lineno: 2,
                    start: 27,
                    tokenizer: [object Object],
                    value: var
                },{
                    type: RETURN,
                    children: ,
                    end: 90,
                    lineno: 3,
                    start: 84,
                    tokenizer: [object Object],
                    value: {
                        type: CALL,
                        children: {
                            type: DOT,
                            children: {
                                type: DOT,
                                children: {
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 100,
                                    lineno: 3,
                                    start: 91,
                                    tokenizer: [object Object],
                                    value: Narcissus
                                },{
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 107,
                                    lineno: 3,
                                    start: 101,
                                    tokenizer: [object Object],
                                    value: parser
                                },
                                end: 107,
                                lineno: 3,
                                start: 91,
                                tokenizer: [object Object],
                                value: .
                            },{
                                type: IDENTIFIER,
                                children: ,
                                end: 113,
                                lineno: 3,
                                start: 108,
                                tokenizer: [object Object],
                                value: parse
                            },
                            end: 113,
                            lineno: 3,
                            start: 91,
                            tokenizer: [object Object],
                            value: .
                        },{
                            type: LIST,
                            children: {
                                type: IDENTIFIER,
                                children: ,
                                end: 121,
                                lineno: 3,
                                start: 114,
                                tokenizer: [object Object],
                                value: builder
                            },{
                                type: CALL,
                                children: {
                                    type: DOT,
                                    children: {
                                        type: IDENTIFIER,
                                        children: ,
                                        end: 132,
                                        lineno: 3,
                                        start: 123,
                                        tokenizer: [object Object],
                                        value: parseSelf
                                    },{
                                        type: IDENTIFIER,
                                        children: ,
                                        end: 141,
                                        lineno: 3,
                                        start: 133,
                                        tokenizer: [object Object],
                                        value: toString
                                    },
                                    end: 141,
                                    lineno: 3,
                                    start: 123,
                                    tokenizer: [object Object],
                                    value: .
                                },{
                                    type: LIST,
                                    children: ,
                                    end: 142,
                                    lineno: 3,
                                    start: 141,
                                    tokenizer: [object Object],
                                    value: (
                                },
                                end: 142,
                                lineno: 3,
                                start: 123,
                                tokenizer: [object Object],
                                value: (
                            },{
                                type: STRING,
                                children: ,
                                end: 151,
                                lineno: 3,
                                start: 145,
                                tokenizer: [object Object],
                                value: temp
                            },{
                                type: NUMBER,
                                children: ,
                                end: 154,
                                lineno: 3,
                                start: 153,
                                tokenizer: [object Object],
                                value: 1
                            },
                            end: 154,
                            lineno: 3,
                            start: 113,
                            tokenizer: [object Object],
                            value: (
                        },
                        end: 154,
                        lineno: 3,
                        start: 91,
                        tokenizer: [object Object],
                        value: (
                    }
                },
                end: 90,
                funDecls: ,
                id: 0,
                lineno: 1,
                start: 21,
                tokenizer: [object Object],
                value: {,
                varDecls: {
                    type: IDENTIFIER,
                    children: ,
                    end: 38,
                    initializer: {
                        type: NEW,
                        children: {
                            type: DOT,
                            children: {
                                type: DOT,
                                children: {
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 55,
                                    lineno: 2,
                                    start: 46,
                                    tokenizer: [object Object],
                                    value: Narcissus
                                },{
                                    type: IDENTIFIER,
                                    children: ,
                                    end: 62,
                                    lineno: 2,
                                    start: 56,
                                    tokenizer: [object Object],
                                    value: parser
                                },
                                end: 62,
                                lineno: 2,
                                start: 46,
                                tokenizer: [object Object],
                                value: .
                            },{
                                type: IDENTIFIER,
                                children: ,
                                end: 77,
                                lineno: 2,
                                start: 63,
                                tokenizer: [object Object],
                                value: DefaultBuilder
                            },
                            end: 77,
                            lineno: 2,
                            parenthesized: true,
                            start: 46,
                            tokenizer: [object Object],
                            value: .
                        },
                        end: 77,
                        lineno: 2,
                        start: 41,
                        tokenizer: [object Object],
                        value: new
                    },
                    lineno: 2,
                    name: builder,
                    readOnly: false,
                    start: 31,
                    tokenizer: [object Object],
                    value: builder
                }
            },
            children: ,
            end: 158,
            functionForm: 0,
            lineno: 1,
            name: parseSelf,
            params: ,
            start: 0,
            tokenizer: [object Object],
            value: function
        },
        id: 0,
        lineno: 1,
        tokenizer: [object Object],
        varDecls: 
    }

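        Calling toString on a function in JavaScript returns its source code, so running the code above prints the syntax tree of the parseSelf function. I won't go into the rest; those who like to tinker will know what can be done with it.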
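
        As one possible starting point for tinkering, the sketch below first shows toString on a function, then walks a tree and counts node types. The tree is a hand-built mock that only imitates the shape of the dump above (a type field and a children array); real Narcissus nodes carry more fields and may represent type differently, so mockTree and countTypes are hypothetical names, not part of Narcissus.

```javascript
function add(a, b) { return a + b; }

// Function.prototype.toString yields the function's source text.
var source = add.toString();

// Hypothetical, hand-built tree imitating the shape of the dump above.
var mockTree = {
    type: "SCRIPT",
    children: [
        {
            type: "VAR",
            children: [{ type: "IDENTIFIER", children: [] }]
        },
        {
            type: "VAR",
            children: [{ type: "IDENTIFIER", children: [] }]
        }
    ]
};

// Recursively count how many nodes of each type the tree contains.
function countTypes(node, counts) {
    counts = counts || {};
    counts[node.type] = (counts[node.type] || 0) + 1;
    var kids = node.children || [];
    for (var i = 0; i < kids.length; i++) {
        countTypes(kids[i], counts);
    }
    return counts;
}

var counts = countTypes(mockTree);
```

        The same recursive pattern could be pointed at the tree returned by Narcissus.parser.parse, after checking which fields the real nodes use to hold their children.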

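        Addendum: after some experimentation, it turns out that Narcissus depends too heavily on SpiderMonkey-specific features, and running it on IE requires considerably more changes. In addition, the latest Narcissus source has some bugs; if you want a workable implementation, consider the older Narcissus code included in NarrativeJS.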
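
        Before loading Narcissus in a particular browser, one could probe for the functions mentioned in this article. This is only a rough sketch: the helper name missingFeatures is made up, and the list covers just the two functions named above, not everything Narcissus may rely on.

```javascript
// Returns the names of the probed functions that the current engine lacks.
function missingFeatures() {
    var probes = {
        "Object.create": typeof Object.create === "function",
        "Object.defineProperty": typeof Object.defineProperty === "function"
    };
    var missing = [];
    for (var name in probes) {
        if (probes.hasOwnProperty(name) && !probes[name]) {
            missing.push(name);
        }
    }
    return missing;
}

var missing = missingFeatures();
```

        An empty result only means these two functions exist; it does not guarantee that the rest of the Narcissus code will run in that engine.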
