3

Is there a grammar specification available for smali code? I am experimenting with smali, and one thing that puzzles me is that some methods have a .prologue section and some don't. Unfortunately, the wiki doesn't seem to have any information about the smali grammar. Has anyone run into this before? Any suggestions or solutions would be much appreciated.

EDIT1: My objective is to add log messages at the beginning of the onResume method of every activity in an app.

EDIT2: I am using the ANTLRv4.1 parser to parse my smali files, and I get a CommonTree (the parse tree) and a TokenStream from the smaliLexer. Is creating a Token for the log instruction, altering the parse tree, and then generating the classes.dex file the right way to go? So far I haven't found a way to alter the TokenStream, and I am not able to generate dex files from the altered parse tree.

Cœur
TheGT
  • For the actual instruction set I always have a look at this list of opcodes: http://pallergabor.uw.hu/androidblog/dalvik_opcodes.html No idea about the exact specification, though. – dst Aug 02 '13 at 12:48
  • Thanks for the comment @dst. Yes, I do look at the opcodes. I am more interested in the specification of smali itself: the .prologue directive, .end method, .locals, and so on. I couldn't find that specification anywhere, hence the question. – TheGT Aug 02 '13 at 12:54

2 Answers

3

Almost everything in the smali language has a direct analogue in dalvik bytecode/dex format. In this case, the .prologue directive corresponds to the DBG_SET_PROLOGUE_END debug opcode that is part of the debug_info_item.

From http://s.android.com/tech/dalvik/dex-format.html:

sets the prologue_end state machine register, indicating that the next position entry that is added should be considered the end of a method prologue (an appropriate place for a method breakpoint). The prologue_end register is cleared by any special (>= 0x0a) opcode.
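To make the quoted behaviour concrete, here is a minimal Python sketch of the debug_info state machine, assuming the opcode values and special-opcode constants given in the dex format document. It handles only a subset of the opcodes, skips the debug_info_item header (line_start is passed in directly), and the function names are hypothetical, not part of smali or dexlib2.

```python
# Opcode values as listed in the dex format document.
DBG_END_SEQUENCE = 0x00
DBG_ADVANCE_PC = 0x01
DBG_ADVANCE_LINE = 0x02
DBG_SET_PROLOGUE_END = 0x07
DBG_FIRST_SPECIAL = 0x0A
DBG_LINE_BASE = -4
DBG_LINE_RANGE = 15


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(data, pos):
    """Decode a signed LEB128 value; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, pos


def prologue_end_positions(opcodes, line_start):
    """Return the (address, line) entries emitted while prologue_end was set.

    Hypothetical helper: `opcodes` is the byte stream that follows the
    debug_info_item header. Opcodes 0x03-0x06, 0x08 and 0x09 are not handled.
    """
    address, line, prologue_end = 0, line_start, False
    hits = []
    pos = 0
    while pos < len(opcodes):
        op = opcodes[pos]
        pos += 1
        if op == DBG_END_SEQUENCE:
            break
        elif op == DBG_ADVANCE_PC:
            diff, pos = read_uleb128(opcodes, pos)
            address += diff
        elif op == DBG_ADVANCE_LINE:
            diff, pos = read_sleb128(opcodes, pos)
            line += diff
        elif op == DBG_SET_PROLOGUE_END:
            prologue_end = True
        elif op >= DBG_FIRST_SPECIAL:
            # A special opcode advances line and address, then emits a position entry.
            adjusted = op - DBG_FIRST_SPECIAL
            line += DBG_LINE_BASE + adjusted % DBG_LINE_RANGE
            address += adjusted // DBG_LINE_RANGE
            if prologue_end:
                hits.append((address, line))
            prologue_end = False  # cleared by any special opcode
        else:
            raise ValueError("opcode 0x%02x is not handled in this sketch" % op)
    return hits
```

In this sketch, a .prologue directive in smali corresponds to the 0x07 byte, and the position entry that follows it marks where a method breakpoint would go.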

JesusFreke
  • Thank you. To elaborate, I am interested in finding the first line of a method's body in a .smali file. I thought every method had a .prologue directive, after which the first instruction of the method appears, but in some apps the .prologue directive is not present. Is there a generic way to identify where a method's instructions begin in a .smali file? – TheGT Aug 05 '13 at 17:33
  • It sounds like you're trying to parse the smali files. In order to do that properly... you need to actually parse them, not just search for pieces of text :). A couple of things come to mind. 1. You might look into using the existing ANTLR parser that smali itself uses. 2. You might think about using dexlib2 and working with the dex file directly, rather than having to parse the smali files. – JesusFreke Aug 05 '13 at 19:15
  • Sure, I will take a stab at using ANTLR smali parser or using dexlib2 for parsing smali file and will get back if need be. Thank you for giving me the direction. – TheGT Aug 06 '13 at 17:41
  • (Note: the dexlib2 library is currently in the dexlib_redesign branch of the smali repository) – JesusFreke Aug 06 '13 at 21:44
  • I tried to insert logging statements in the beginning of every method as follows: 1. Walk the CommonTree obtained during parsing 2. I_METHODS --> I_STATEMENTS. Add a statement corresponding to logging as 0th child of I_STATEMENTS and shift all the other statements index by one, and adjusting the parent node accordingly. [Continued in next comment...] – TheGT Aug 12 '13 at 19:22
  • But the problem I have is with creating a node in the tree for my statement. It seems I could use dupNode, setChildIndex, setTokenStartIndex and setTokenEndIndex, but from my understanding this requires the token stream to be updated as well. Is this the only way to inject code into a smali file? Could you give some input on this approach? – TheGT Aug 12 '13 at 19:22
  • If you want to actually modify the method, you will likely have better luck using dexlib2. – JesusFreke Aug 13 '13 at 21:40
  • Yes, using dexlib helped my case. Thanks for guiding me here. – TheGT Aug 20 '13 at 17:41
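For readers who only need to locate where a method's instructions begin, whether or not .prologue is present, the following is a rough line-based Python sketch. It is a text-level approximation, not the ANTLR or dexlib2 approach recommended above. It assumes one statement per line and that directives, comments and labels start with ".", "#" and ":" respectively. The function name is hypothetical.

```python
def first_instruction_index(lines, method_name):
    """Return the index of the first instruction line of `method_name`, or None.

    Rough approximation only: a real tool should parse the file properly.
    """
    in_method = False
    skipping = None  # name of the multi-line block currently being skipped
    for i, raw in enumerate(lines):
        line = raw.strip()
        if not in_method:
            # A method header looks like: .method protected onResume()V
            if line.startswith(".method ") and line.split()[-1].startswith(method_name + "("):
                in_method = True
            continue
        if line.startswith(".end method"):
            return None  # the method has no instructions (e.g. abstract or native)
        if skipping:
            if line == ".end " + skipping:
                skipping = None
            continue
        if line.startswith(".annotation"):
            # Annotation bodies contain lines that do not start with '.'.
            skipping = "annotation"
            continue
        if not line or line[0] in ".#:":
            continue  # blank line, directive (.locals, .prologue, .line), comment or label
        return i
    return None
```

A log call inserted at the returned index would also need free registers, so .locals may have to be raised; this sketch does not attempt that.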
1

You could look at the smali lexer description itself, which is built from a JFlex grammar. Skip the preamble code and go to line 177, where the token specifications begin.

Seki
  • Thanks Seki - that should be a good starting point. The code you pointed out has all token specifications. What I want is the parser code. But thanks for guiding me in the right direction, I can take a look at the apktool smali parser source code to get a better idea. – TheGT Aug 02 '13 at 16:27