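Parsing JavaScript Code with Narcissus
I've recently been working on a JavaScript experiment that requires parsing JavaScript code into a syntax tree on the client side. In other words, I need a JavaScript parser written in JavaScript. There are many options: common tools such as yacc, lex, and bison all have JavaScript versions, and ANTLR can also generate JavaScript as its target. But I didn't want to spend too much time on this, so I looked for a ready-made tool and eventually settled on Narcissus.
Narcissus is a JavaScript engine written entirely in JavaScript. It relies on some SpiderMonkey extensions, however, so it cannot run directly on engines that implement only ECMAScript 3 (the browsers, for example). According to its Wikipedia page, Narcissus was developed by Brendan Eich, the author of SpiderMonkey, and is named after the figure in Greek mythology who fell in love with his own reflection, which fits the idea of "a JavaScript engine written in JavaScript" (how cultured). There is also a Firefox add-on, Zaphod, that replaces the browser's JavaScript engine with Narcissus.
Narcissus is a very simple JavaScript engine that can be used to explore new JavaScript language features. It does almost no optimization, so it can't compete with other engines on performance, but it clearly contains a complete JavaScript parser, which is exactly what I need. First, download its source code from GitHub. It consists of six files, of which I need only three, including jsdefs.js and jsparse.js, which are discussed below.
As mentioned earlier, Narcissus can't run directly in a browser, so we have to modify it. First, in jsdefs.js, we need to take the definition at the top of the file that uses Object.create: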
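Whether a given engine offers the features this post works around can be checked with simple feature detection. The following is only a sketch and not part of Narcissus: the `missingFeatures` function and the list of checked names are hypothetical, chosen because `Object.create` and `Object.defineProperty` are the calls modified in jsdefs.js.

```javascript
// Hypothetical helper: reports which of the Object functions touched in this
// post are absent from the current engine.
function missingFeatures() {
    var required = ["create", "defineProperty"];
    var missing = [];
    for (var i = 0; i < required.length; i++) {
        if (typeof Object[required[i]] !== "function") {
            missing.push("Object." + required[i]);
        }
    }
    return missing;
}

// An empty array means both functions are available.
var missing = missingFeatures();
```

An engine that reports missing entries would need the kind of manual patching described in this post.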
(function() {
    var builderTypes = Object.create(null, {
        ...
    });
    ...
    var narcissus = {
        ...
    };
    Narcissus = narcissus;
})();
and replace it with a plain declaration:
var Narcissus = { };
Next, still in jsdefs.js, we need to change the implementations of defineProperty and defineGetter:
function defineGetter(obj, prop, fn, dontDelete, dontEnum) {
    Object.defineProperty(...);
}
function defineProperty(obj, prop, val, dontDelete, readOnly, dontEnum) {
    Object.defineProperty(...);
}
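Both helpers call Object.defineProperty, which was standardized in ECMAScript 5 and is not available in engines that implement only ECMAScript 3. We therefore change them to "direct assignment" versions: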
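function defineGetter(obj, prop, fn, dontDelete, dontEnum) {
    // Stores the function itself; it no longer runs automatically on property access.
    obj[prop] = fn;
}
function defineProperty(obj, prop, val, dontDelete, readOnly, dontEnum) {
    // Plain assignment; the dontDelete, readOnly and dontEnum flags are ignored.
    obj[prop] = val;
}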
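Of course, this is not equivalent to the original behavior, but it doesn't affect how the code is used. You can find where these two functions are used in jsparse.js.
Now you can include these three JavaScript files in a page and use Narcissus.parser to parse JavaScript code. Narcissus has almost no documentation, but it isn't hard to work out how to use it from the code. For example: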
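To make the non-equivalence concrete, here is a minimal sketch (not taken from Narcissus) comparing a getter defined through Object.defineProperty with the direct-assignment replacement. The helper names `defineGetterES5` and `defineGetterAssign` and the sample objects are hypothetical, and the first helper assumes an engine with ECMAScript 5 support.

```javascript
// Hypothetical helper: defines a real getter (needs ECMAScript 5 support).
function defineGetterES5(obj, prop, fn) {
    Object.defineProperty(obj, prop, {
        get: fn,
        configurable: true,
        enumerable: true
    });
}

// The direct-assignment replacement: the property holds the function itself.
function defineGetterAssign(obj, prop, fn) {
    obj[prop] = fn;
}

function answer() { return 42; }

var a = {};
defineGetterES5(a, "value", answer);

var b = {};
defineGetterAssign(b, "value", answer);

// With a real getter, reading the property runs the function.
var viaGetter = a.value;          // 42
// With direct assignment, reading the property yields the function,
// so a caller has to invoke it explicitly.
var viaAssign = typeof b.value;   // "function"
var viaAssignCalled = b.value();  // 42
```

So the direct-assignment version is only safe where callers never rely on the property being evaluated implicitly on access.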
function parseSelf() {
    var builder = new Narcissus.parser.DefaultBuilder();
    return Narcissus.parser.parse(builder, parseSelf.toString(), "temp", 1);
}
document.write("<pre>" + parseSelf() + "</pre>");

{
type: SCRIPT,
children: {
type: FUNCTION,
body: {
type: SCRIPT,
children: {
type: VAR,
children: {
type: IDENTIFIER,
children: ,
end: 38,
initializer: {
type: NEW,
children: {
type: DOT,
children: {
type: DOT,
children: {
type: IDENTIFIER,
children: ,
end: 55,
lineno: 2,
start: 46,
tokenizer: [object Object],
value: Narcissus
},{
type: IDENTIFIER,
children: ,
end: 62,
lineno: 2,
start: 56,
tokenizer: [object Object],
value: parser
},
end: 62,
lineno: 2,
start: 46,
tokenizer: [object Object],
value: .
},{
type: IDENTIFIER,
children: ,
end: 77,
lineno: 2,
start: 63,
tokenizer: [object Object],
value: DefaultBuilder
},
end: 77,
lineno: 2,
parenthesized: true,
start: 46,
tokenizer: [object Object],
value: .
},
end: 77,
lineno: 2,
start: 41,
tokenizer: [object Object],
value: new
},
lineno: 2,
name: builder,
readOnly: false,
start: 31,
tokenizer: [object Object],
value: builder
},
destructurings: ,
end: 38,
lineno: 2,
start: 27,
tokenizer: [object Object],
value: var
},{
type: RETURN,
children: ,
end: 90,
lineno: 3,
start: 84,
tokenizer: [object Object],
value: {
type: CALL,
children: {
type: DOT,
children: {
type: DOT,
children: {
type: IDENTIFIER,
children: ,
end: 100,
lineno: 3,
start: 91,
tokenizer: [object Object],
value: Narcissus
},{
type: IDENTIFIER,
children: ,
end: 107,
lineno: 3,
start: 101,
tokenizer: [object Object],
value: parser
},
end: 107,
lineno: 3,
start: 91,
tokenizer: [object Object],
value: .
},{
type: IDENTIFIER,
children: ,
end: 113,
lineno: 3,
start: 108,
tokenizer: [object Object],
value: parse
},
end: 113,
lineno: 3,
start: 91,
tokenizer: [object Object],
value: .
},{
type: LIST,
children: {
type: IDENTIFIER,
children: ,
end: 121,
lineno: 3,
start: 114,
tokenizer: [object Object],
value: builder
},{
type: CALL,
children: {
type: DOT,
children: {
type: IDENTIFIER,
children: ,
end: 132,
lineno: 3,
start: 123,
tokenizer: [object Object],
value: parseSelf
},{
type: IDENTIFIER,
children: ,
end: 141,
lineno: 3,
start: 133,
tokenizer: [object Object],
value: toString
},
end: 141,
lineno: 3,
start: 123,
tokenizer: [object Object],
value: .
},{
type: LIST,
children: ,
end: 142,
lineno: 3,
start: 141,
tokenizer: [object Object],
value: (
},
end: 142,
lineno: 3,
start: 123,
tokenizer: [object Object],
value: (
},{
type: STRING,
children: ,
end: 151,
lineno: 3,
start: 145,
tokenizer: [object Object],
value: temp
},{
type: NUMBER,
children: ,
end: 154,
lineno: 3,
start: 153,
tokenizer: [object Object],
value: 1
},
end: 154,
lineno: 3,
start: 113,
tokenizer: [object Object],
value: (
},
end: 154,
lineno: 3,
start: 91,
tokenizer: [object Object],
value: (
}
},
end: 90,
funDecls: ,
id: 0,
lineno: 1,
start: 21,
tokenizer: [object Object],
value: {,
varDecls: {
type: IDENTIFIER,
children: ,
end: 38,
initializer: {
type: NEW,
children: {
type: DOT,
children: {
type: DOT,
children: {
type: IDENTIFIER,
children: ,
end: 55,
lineno: 2,
start: 46,
tokenizer: [object Object],
value: Narcissus
},{
type: IDENTIFIER,
children: ,
end: 62,
lineno: 2,
start: 56,
tokenizer: [object Object],
value: parser
},
end: 62,
lineno: 2,
start: 46,
tokenizer: [object Object],
value: .
},{
type: IDENTIFIER,
children: ,
end: 77,
lineno: 2,
start: 63,
tokenizer: [object Object],
value: DefaultBuilder
},
end: 77,
lineno: 2,
parenthesized: true,
start: 46,
tokenizer: [object Object],
value: .
},
end: 77,
lineno: 2,
start: 41,
tokenizer: [object Object],
value: new
},
lineno: 2,
name: builder,
readOnly: false,
start: 31,
tokenizer: [object Object],
value: builder
}
},
children: ,
end: 158,
functionForm: 0,
lineno: 1,
name: parseSelf,
params: ,
start: 0,
tokenizer: [object Object],
value: function
},
funDecls: {
type: FUNCTION,
body: {
type: SCRIPT,
children: {
type: VAR,
children: {
type: IDENTIFIER,
children: ,
end: 38,
initializer: {
type: NEW,
children: {
type: DOT,
children: {
type: DOT,
children: {
type: IDENTIFIER,
children: ,
end: 55,
lineno: 2,
start: 46,
tokenizer: [object Object],
value: Narcissus
},{
type: IDENTIFIER,
children: ,
end: 62,
lineno: 2,
start: 56,
tokenizer: [object Object],
value: parser
},
end: 62,
lineno: 2,
start: 46,
tokenizer: [object Object],
value: .
},{
type: IDENTIFIER,
children: ,
end: 77,
lineno: 2,
start: 63,
tokenizer: [object Object],
value: DefaultBuilder
},
end: 77,
lineno: 2,
parenthesized: true,
start: 46,
tokenizer: [object Object],
value: .
},
end: 77,
lineno: 2,
start: 41,
tokenizer: [object Object],
value: new
},
lineno: 2,
name: builder,
readOnly: false,
start: 31,
tokenizer: [object Object],
value: builder
},
destructurings: ,
end: 38,
lineno: 2,
start: 27,
tokenizer: [object Object],
value: var
},{
type: RETURN,
children: ,
end: 90,
lineno: 3,
start: 84,
tokenizer: [object Object],
value: {
type: CALL,
children: {
type: DOT,
children: {
type: DOT,
children: {
type: IDENTIFIER,
children: ,
end: 100,
lineno: 3,
start: 91,
tokenizer: [object Object],
value: Narcissus
},{
type: IDENTIFIER,
children: ,
end: 107,
lineno: 3,
start: 101,
tokenizer: [object Object],
value: parser
},
end: 107,
lineno: 3,
start: 91,
tokenizer: [object Object],
value: .
},{
type: IDENTIFIER,
children: ,
end: 113,
lineno: 3,
start: 108,
tokenizer: [object Object],
value: parse
},
end: 113,
lineno: 3,
start: 91,
tokenizer: [object Object],
value: .
},{
type: LIST,
children: {
type: IDENTIFIER,
children: ,
end: 121,
lineno: 3,
start: 114,
tokenizer: [object Object],
value: builder
},{
type: CALL,
children: {
type: DOT,
children: {
type: IDENTIFIER,
children: ,
end: 132,
lineno: 3,
start: 123,
tokenizer: [object Object],
value: parseSelf
},{
type: IDENTIFIER,
children: ,
end: 141,
lineno: 3,
start: 133,
tokenizer: [object Object],
value: toString
},
end: 141,
lineno: 3,
start: 123,
tokenizer: [object Object],
value: .
},{
type: LIST,
children: ,
end: 142,
lineno: 3,
start: 141,
tokenizer: [object Object],
value: (
},
end: 142,
lineno: 3,
start: 123,
tokenizer: [object Object],
value: (
},{
type: STRING,
children: ,
end: 151,
lineno: 3,
start: 145,
tokenizer: [object Object],
value: temp
},{
type: NUMBER,
children: ,
end: 154,
lineno: 3,
start: 153,
tokenizer: [object Object],
value: 1
},
end: 154,
lineno: 3,
start: 113,
tokenizer: [object Object],
value: (
},
end: 154,
lineno: 3,
start: 91,
tokenizer: [object Object],
value: (
}
},
end: 90,
funDecls: ,
id: 0,
lineno: 1,
start: 21,
tokenizer: [object Object],
value: {,
varDecls: {
type: IDENTIFIER,
children: ,
end: 38,
initializer: {
type: NEW,
children: {
type: DOT,
children: {
type: DOT,
children: {
type: IDENTIFIER,
children: ,
end: 55,
lineno: 2,
start: 46,
tokenizer: [object Object],
value: Narcissus
},{
type: IDENTIFIER,
children: ,
end: 62,
lineno: 2,
start: 56,
tokenizer: [object Object],
value: parser
},
end: 62,
lineno: 2,
start: 46,
tokenizer: [object Object],
value: .
},{
type: IDENTIFIER,
children: ,
end: 77,
lineno: 2,
start: 63,
tokenizer: [object Object],
value: DefaultBuilder
},
end: 77,
lineno: 2,
parenthesized: true,
start: 46,
tokenizer: [object Object],
value: .
},
end: 77,
lineno: 2,
start: 41,
tokenizer: [object Object],
value: new
},
lineno: 2,
name: builder,
readOnly: false,
start: 31,
tokenizer: [object Object],
value: builder
}
},
children: ,
end: 158,
functionForm: 0,
lineno: 1,
name: parseSelf,
params: ,
start: 0,
tokenizer: [object Object],
value: function
},
id: 0,
lineno: 1,
tokenizer: [object Object],
varDecls:
}
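In JavaScript, calling a function's toString method returns its source code, so running the code above prints the syntax tree of the parseSelf function. I'll leave the rest unsaid; those who like to tinker will know what to do with it.
Addendum: after some experimenting, it turns out Narcissus still depends too heavily on SpiderMonkey-specific features, and running it on IE requires more modifications. In addition, the latest Narcissus source has some bugs; if you want a suitable implementation, consider the older version of the Narcissus code included in NarrativeJS.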
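As one example of what can be done with such a tree, the sketch below walks a node structure and collects identifier names. It runs on a small hand-built mock, not on real Narcissus output. The `type`, `children` and `value` fields mirror the dump above, but the string type names, the `collectIdentifiers` function, and the assumption that `children` is an array are illustrative only; the real parser may represent node types differently.

```javascript
// Hand-built mock of the expression `Narcissus.parser`, shaped like the dump above.
var mockTree = {
    type: "DOT",
    value: ".",
    children: [
        { type: "IDENTIFIER", value: "Narcissus", children: [] },
        { type: "IDENTIFIER", value: "parser", children: [] }
    ]
};

// Depth-first walk that gathers the value of every IDENTIFIER node.
function collectIdentifiers(node, found) {
    found = found || [];
    if (node.type === "IDENTIFIER") {
        found.push(node.value);
    }
    var children = node.children || [];
    for (var i = 0; i < children.length; i++) {
        collectIdentifiers(children[i], found);
    }
    return found;
}

var names = collectIdentifiers(mockTree);  // ["Narcissus", "parser"]
```

This walk only follows `children`. In the dump above, subtrees also hang off fields such as `body`, `initializer` and the `value` of a RETURN node, so a complete traversal would need to visit those as well.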
- Author: 老赵; Source: 老赵点滴
- Tags: Narcissus, parsing
- Published: 2011-02-13 22:32:32