Parsing JavaScript Code with Narcissus
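Recently I have been working on a JavaScript experiment that requires parsing JavaScript code into a syntax tree on the client side. In other words, I need a JavaScript parser implemented in JavaScript. There are plenty of options: the usual yacc, lex, bison and so on all have JavaScript versions, and ANTLR can also generate JavaScript as its target. I did not want to spend much time on this, though, so I went looking for a ready-made tool and eventually settled on Narcissus.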
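Narcissus is a JavaScript engine written entirely in JavaScript. It relies on some SpiderMonkey extensions, however, so it cannot run directly on engines that implement only ECMAScript 3 (the browsers, for example). According to its Wikipedia page, Narcissus was developed by Brendan Eich, the author of SpiderMonkey, and is named after the figure in Greek mythology who fell in love with his own reflection, which fits the idea of "a JavaScript engine written in JavaScript" (how cultured). In addition, there is a Firefox add-on called Zaphod that replaces the browser's JavaScript engine with Narcissus.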
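Narcissus is a very simple JavaScript engine that can be used to explore new JavaScript language features. It performs almost no optimization, so it cannot compete with other engines on performance, but it clearly contains a complete JavaScript parser, which is exactly what I need. First, download its source code from GitHub. It contains six files, of which I need only three (jsdefs.js and jsparse.js, both mentioned later, are among them).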
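As mentioned earlier, Narcissus cannot run directly in the browser, so we have to modify it. First, in jsdefs.js, the definition at the top of the file that relies on Object.create:

    (function() {
        var builderTypes = Object.create(null, { ... });
        ...
        var narcissus = { ... };
        Narcissus = narcissus;
    })();

needs to be replaced with a plain declaration:

    var Narcissus = { };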
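Next, still in jsdefs.js, we need to change the implementations of defineProperty and defineGetter:

    function defineGetter(obj, prop, fn, dontDelete, dontEnum) {
        Object.defineProperty(...);
    }

    function defineProperty(obj, prop, val, dontDelete, readOnly, dontEnum) {
        Object.defineProperty(...);
    }

Both helpers are built on Object.defineProperty, which was standardized in ECMAScript 5 and is therefore not available on engines that implement only ECMAScript 3. We change them to "direct assignment" versions:

    function defineGetter(obj, prop, fn, dontDelete, dontEnum) {
        obj[prop] = fn;
    }

    function defineProperty(obj, prop, val, dontDelete, readOnly, dontEnum) {
        obj[prop] = val;
    }

Of course, this is not equivalent to the original behavior, but it does not affect how the code is used here. You can find the places where these two functions are called in jsparse.js.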
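To make the "not equivalent" remark concrete, here is a minimal sketch comparing an accessor-based getter with the direct-assignment version. The body of `defineGetterWithAccessor` is an assumption, since the article elides the original `Object.defineProperty(...)` arguments; both function names are hypothetical and chosen only for this illustration.

```javascript
// Hypothetical accessor-based version. The real arguments are elided in the
// article, so this descriptor is an assumption for illustration only.
function defineGetterWithAccessor(obj, prop, fn) {
    Object.defineProperty(obj, prop, {
        get: fn,
        configurable: true,
        enumerable: true
    });
}

// The direct-assignment replacement shown in the article.
function defineGetterByAssignment(obj, prop, fn) {
    obj[prop] = fn;
}

var withAccessor = {};
var withAssignment = {};

defineGetterWithAccessor(withAccessor, "answer", function () { return 42; });
defineGetterByAssignment(withAssignment, "answer", function () { return 42; });

// Reading the accessor runs the function and yields its result.
console.log(typeof withAccessor.answer);   // "number"

// Reading the assigned property yields the function itself.
console.log(typeof withAssignment.answer); // "function"
```

With direct assignment, a caller that expects a computed value gets a function instead and would have to call it explicitly. Flags such as dontDelete and readOnly are also silently ignored.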
Now you can include these three JavaScript files in a page and use Narcissus.parser to parse JavaScript code. Narcissus has almost no documentation, but it is not hard to work out how to use it from the source. For example:

    function parseSelf() {
        var builder = new Narcissus.parser.DefaultBuilder();
        return Narcissus.parser.parse(builder, parseSelf.toString(), "temp", 1);
    }

    document.write("" + parseSelf() + "");
{ type: SCRIPT, children: { type: FUNCTION, body: { type: SCRIPT, children: { type: VAR, children: { type: IDENTIFIER, children: , end: 38, initializer: { type: NEW, children: { type: DOT, children: { type: DOT, children: { type: IDENTIFIER, children: , end: 55, lineno: 2, start: 46, tokenizer: [object Object], value: Narcissus },{ type: IDENTIFIER, children: , end: 62, lineno: 2, start: 56, tokenizer: [object Object], value: parser }, end: 62, lineno: 2, start: 46, tokenizer: [object Object], value: . },{ type: IDENTIFIER, children: , end: 77, lineno: 2, start: 63, tokenizer: [object Object], value: DefaultBuilder }, end: 77, lineno: 2, parenthesized: true, start: 46, tokenizer: [object Object], value: . }, end: 77, lineno: 2, start: 41, tokenizer: [object Object], value: new }, lineno: 2, name: builder, readOnly: false, start: 31, tokenizer: [object Object], value: builder }, destructurings: , end: 38, lineno: 2, start: 27, tokenizer: [object Object], value: var },{ type: RETURN, children: , end: 90, lineno: 3, start: 84, tokenizer: [object Object], value: { type: CALL, children: { type: DOT, children: { type: DOT, children: { type: IDENTIFIER, children: , end: 100, lineno: 3, start: 91, tokenizer: [object Object], value: Narcissus },{ type: IDENTIFIER, children: , end: 107, lineno: 3, start: 101, tokenizer: [object Object], value: parser }, end: 107, lineno: 3, start: 91, tokenizer: [object Object], value: . },{ type: IDENTIFIER, children: , end: 113, lineno: 3, start: 108, tokenizer: [object Object], value: parse }, end: 113, lineno: 3, start: 91, tokenizer: [object Object], value: . 
},{ type: LIST, children: { type: IDENTIFIER, children: , end: 121, lineno: 3, start: 114, tokenizer: [object Object], value: builder },{ type: CALL, children: { type: DOT, children: { type: IDENTIFIER, children: , end: 132, lineno: 3, start: 123, tokenizer: [object Object], value: parseSelf },{ type: IDENTIFIER, children: , end: 141, lineno: 3, start: 133, tokenizer: [object Object], value: toString }, end: 141, lineno: 3, start: 123, tokenizer: [object Object], value: . },{ type: LIST, children: , end: 142, lineno: 3, start: 141, tokenizer: [object Object], value: ( }, end: 142, lineno: 3, start: 123, tokenizer: [object Object], value: ( },{ type: STRING, children: , end: 151, lineno: 3, start: 145, tokenizer: [object Object], value: temp },{ type: NUMBER, children: , end: 154, lineno: 3, start: 153, tokenizer: [object Object], value: 1 }, end: 154, lineno: 3, start: 113, tokenizer: [object Object], value: ( }, end: 154, lineno: 3, start: 91, tokenizer: [object Object], value: ( } }, end: 90, funDecls: , id: 0, lineno: 1, start: 21, tokenizer: [object Object], value: {, varDecls: { type: IDENTIFIER, children: , end: 38, initializer: { type: NEW, children: { type: DOT, children: { type: DOT, children: { type: IDENTIFIER, children: , end: 55, lineno: 2, start: 46, tokenizer: [object Object], value: Narcissus },{ type: IDENTIFIER, children: , end: 62, lineno: 2, start: 56, tokenizer: [object Object], value: parser }, end: 62, lineno: 2, start: 46, tokenizer: [object Object], value: . },{ type: IDENTIFIER, children: , end: 77, lineno: 2, start: 63, tokenizer: [object Object], value: DefaultBuilder }, end: 77, lineno: 2, parenthesized: true, start: 46, tokenizer: [object Object], value: . 
}, end: 77, lineno: 2, start: 41, tokenizer: [object Object], value: new }, lineno: 2, name: builder, readOnly: false, start: 31, tokenizer: [object Object], value: builder } }, children: , end: 158, functionForm: 0, lineno: 1, name: parseSelf, params: , start: 0, tokenizer: [object Object], value: function }, funDecls: { type: FUNCTION, body: { type: SCRIPT, children: { type: VAR, children: { type: IDENTIFIER, children: , end: 38, initializer: { type: NEW, children: { type: DOT, children: { type: DOT, children: { type: IDENTIFIER, children: , end: 55, lineno: 2, start: 46, tokenizer: [object Object], value: Narcissus },{ type: IDENTIFIER, children: , end: 62, lineno: 2, start: 56, tokenizer: [object Object], value: parser }, end: 62, lineno: 2, start: 46, tokenizer: [object Object], value: . },{ type: IDENTIFIER, children: , end: 77, lineno: 2, start: 63, tokenizer: [object Object], value: DefaultBuilder }, end: 77, lineno: 2, parenthesized: true, start: 46, tokenizer: [object Object], value: . }, end: 77, lineno: 2, start: 41, tokenizer: [object Object], value: new }, lineno: 2, name: builder, readOnly: false, start: 31, tokenizer: [object Object], value: builder }, destructurings: , end: 38, lineno: 2, start: 27, tokenizer: [object Object], value: var },{ type: RETURN, children: , end: 90, lineno: 3, start: 84, tokenizer: [object Object], value: { type: CALL, children: { type: DOT, children: { type: DOT, children: { type: IDENTIFIER, children: , end: 100, lineno: 3, start: 91, tokenizer: [object Object], value: Narcissus },{ type: IDENTIFIER, children: , end: 107, lineno: 3, start: 101, tokenizer: [object Object], value: parser }, end: 107, lineno: 3, start: 91, tokenizer: [object Object], value: . },{ type: IDENTIFIER, children: , end: 113, lineno: 3, start: 108, tokenizer: [object Object], value: parse }, end: 113, lineno: 3, start: 91, tokenizer: [object Object], value: . 
},{ type: LIST, children: { type: IDENTIFIER, children: , end: 121, lineno: 3, start: 114, tokenizer: [object Object], value: builder },{ type: CALL, children: { type: DOT, children: { type: IDENTIFIER, children: , end: 132, lineno: 3, start: 123, tokenizer: [object Object], value: parseSelf },{ type: IDENTIFIER, children: , end: 141, lineno: 3, start: 133, tokenizer: [object Object], value: toString }, end: 141, lineno: 3, start: 123, tokenizer: [object Object], value: . },{ type: LIST, children: , end: 142, lineno: 3, start: 141, tokenizer: [object Object], value: ( }, end: 142, lineno: 3, start: 123, tokenizer: [object Object], value: ( },{ type: STRING, children: , end: 151, lineno: 3, start: 145, tokenizer: [object Object], value: temp },{ type: NUMBER, children: , end: 154, lineno: 3, start: 153, tokenizer: [object Object], value: 1 }, end: 154, lineno: 3, start: 113, tokenizer: [object Object], value: ( }, end: 154, lineno: 3, start: 91, tokenizer: [object Object], value: ( } }, end: 90, funDecls: , id: 0, lineno: 1, start: 21, tokenizer: [object Object], value: {, varDecls: { type: IDENTIFIER, children: , end: 38, initializer: { type: NEW, children: { type: DOT, children: { type: DOT, children: { type: IDENTIFIER, children: , end: 55, lineno: 2, start: 46, tokenizer: [object Object], value: Narcissus },{ type: IDENTIFIER, children: , end: 62, lineno: 2, start: 56, tokenizer: [object Object], value: parser }, end: 62, lineno: 2, start: 46, tokenizer: [object Object], value: . },{ type: IDENTIFIER, children: , end: 77, lineno: 2, start: 63, tokenizer: [object Object], value: DefaultBuilder }, end: 77, lineno: 2, parenthesized: true, start: 46, tokenizer: [object Object], value: . 
}, end: 77, lineno: 2, start: 41, tokenizer: [object Object], value: new }, lineno: 2, name: builder, readOnly: false, start: 31, tokenizer: [object Object], value: builder } }, children: , end: 158, functionForm: 0, lineno: 1, name: parseSelf, params: , start: 0, tokenizer: [object Object], value: function }, id: 0, lineno: 1, tokenizer: [object Object], varDecls: }
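In JavaScript, calling a function's toString method returns its source code, so running the code above prints the syntax tree of the parseSelf function. I will leave the rest to you; anyone who likes to tinker will know what can be done with it.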
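As one possible starting point, here is a minimal sketch that walks a tree of this general shape and counts nodes by type. It runs on a small hand-built sample, not on real Narcissus output. The node layout (a string `type`, a `children` array, and child nodes stored under properties such as `initializer` or `value`) is an assumption modeled on the printed dump, and real Narcissus nodes may represent `type` differently. `countNodeTypes` and `sampleTree` are hypothetical names.

```javascript
// Hand-built sample loosely modeled on the printed dump (an assumption,
// not actual parser output).
var sampleTree = {
    type: "SCRIPT",
    children: [
        {
            type: "VAR",
            value: "var",
            children: [
                {
                    type: "IDENTIFIER",
                    value: "builder",
                    children: [],
                    initializer: { type: "NEW", value: "new", children: [] }
                }
            ]
        },
        {
            type: "RETURN",
            children: [],
            value: { type: "CALL", value: "(", children: [] }
        }
    ]
};

// Properties skipped during the walk: in the dump, tokenizer is not a syntax
// node, and funDecls / varDecls repeat nodes that already appear elsewhere.
var SKIPPED_KEYS = { tokenizer: true, funDecls: true, varDecls: true };

// Recursively counts how many nodes of each type the tree contains.
function countNodeTypes(node, counts) {
    counts = counts || {};
    if (!node || typeof node !== "object") {
        return counts;
    }
    if (Array.isArray(node)) {
        for (var i = 0; i < node.length; i++) {
            countNodeTypes(node[i], counts);
        }
        return counts;
    }
    if (typeof node.type === "undefined") {
        return counts; // not a syntax node, so do not descend into it
    }
    counts[node.type] = (counts[node.type] || 0) + 1;
    for (var key in node) {
        if (Object.prototype.hasOwnProperty.call(node, key) && !SKIPPED_KEYS[key]) {
            countNodeTypes(node[key], counts);
        }
    }
    return counts;
}

var counts = countNodeTypes(sampleTree);
console.log(counts);
```

Visiting every property that holds a node, not just `children`, matters because in the dump the expression of a RETURN node sits under `value` and a variable's initial value sits under `initializer`.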
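Addendum: after some experimentation, Narcissus turns out to rely too heavily on SpiderMonkey-specific features, so running it on IE requires modifying more of the code. In addition, the latest Narcissus source has a few bugs. If you want a workable implementation, consider the older version of Narcissus included in NarrativeJS.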
- Author: 老赵; Source: 老赵点滴
- Tags: Narcissus, parsing
- Published: 2011-02-13 22:32:32