Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set lr correctly in blx esil on arm #22377

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

bqv
Copy link
Contributor

@bqv bqv commented Nov 9, 2023

  • Mark this if you consider it ready to merge
  • I've added tests (optional)
  • I wrote some lines in the book (optional)

Description

see #22373

@trufae
Copy link
Collaborator

trufae commented Nov 9, 2023

Uhm is your branch based on master?

@bqv
Copy link
Contributor Author

bqv commented Nov 9, 2023

oops nope. 5.8.8, will rebase

@bqv bqv force-pushed the arm-esil-blx branch 2 times, most recently from 04305fb to fc984e3 Compare November 9, 2023 15:59
@trufae
Copy link
Collaborator

trufae commented Nov 10, 2023

i think the problem is that the program counter results in an unaligned address,and the proper behaviour here would be to remove the lower bit and set the thumb mode flag instead

@bqv
Copy link
Contributor Author

bqv commented Nov 11, 2023

This and the issue may have been a misunderstanding, I've updated to a build from master and some things have changed or been solved already; issue I'm trying to solve now is -
Emulating the bx lr instruction below results in a misalignment in disassembly:

[0x00002804]> aes; pd8@pc-8; aer=
│           0x00002828      4000           lsls r0, r0, 1
│           0x0000282a      a0f57050       sub.w r0, r0, 0x3c00        ; arg1
│           0x0000282e      0088           ldrh r0, [r0]               ; arg1
│           ;-- aav.0x00002830:
│           ;-- pc:
│           ;-- r15:
└           *0x002830*      7047           bx lr
            0x00002832      00bf           nop
            0x00002834      7a2c           cmp r4, 0x7a                ; 2'z'
            0x00002836      0000           movs r0, r0
            0x00002838      9222           movs r2, 0x92
     sb 0x00000000       sl 0x00000000       fp 0x20078000       ip 0x00007f14
     sp 0x20077ff8       lr 0x00001de6       pc 0x00002830       r0 0x00000000
     r1 0x00004aa2       r2 0x00000000       r3 0x00000000       r4 0x00000000
     r5 0x00000000       r6 0x00000000       r7 0x20077ff8       r8 0x00000000
     r9 0x00000000      r10 0x00000000      r11 0x20078000      r12 0x00007f14
    r13 0x20077ff8      r14 0x00001de6      r15 0x00002830     cpsr 0x80000000
[0x00002804]> aes; pd8@pc-8; aer=
┌ 12: sym.Java_kl_ime_oh_BinaryDictionary_toAccentLess (int16_t arg3);
│ rg: 1 (vars 0, args 1)
│ bp: 0 (vars 0, args 0)
│ sp: 0 (vars 0, args 0)
│           0x00001ddc      80b5           push {r7, lr}
│           0x00001dde      6f46           mov r7, sp
│           0x00001de0      90b2           uxth r0, r2                 ; arg3
│           0x00001de2  ~   fff77cee       blx fcn.latinime::Dictionary.toAccentLess_unsigned_short_ ; 0x1adc
│           ;-- pc:
│           ;-- r15:
└           *0x001de4*  ~   7cee80bd       cdp p13, 7, c11, c12, c0, 4
│           ;-- lr:
│           ;-- r14:
└           0x00001de6      80bd           pop {r7, pc}
┌ 244: sym.Java_kl_ime_oh_BinaryDictionary_getSuggestionsNative (int16_t arg1, int16_t arg3, int16_t arg4);
│ rg: 3 (vars 0, args 3)
│ bp: 0 (vars 0, args 0)
│ sp: 6 (vars 6, args 0)
│           0x00001de8      f0b5           push {r4, r5, r6, r7, lr}
│           0x00001dea      03af           add r7, sp, 0xc
     sb 0x00000000       sl 0x00000000       fp 0x20078000       ip 0x00007f14
     sp 0x20077ff8       lr 0x00001de6       pc 0x00001de4       r0 0x00000000
     r1 0x00004aa2       r2 0x00000000       r3 0x00000000       r4 0x00000000
     r5 0x00000000       r6 0x00000000       r7 0x20077ff8       r8 0x00000000
     r9 0x00000000      r10 0x00000000      r11 0x20078000      r12 0x00007f14
    r13 0x20077ff8      r14 0x00001de6      r15 0x00001de4     cpsr 0x80000000
[0x00001de4]> 

where the instruction's esil is:

[0x00001de4]> aoe @ 0x2830
0x2830 1,lr,-,pc,:=

Stepping over again, behaves as expected, though. Is this expected? Naturally I can make this go away with asm.flags.middle=1, but it seems strange that pc ends up misaligned

@trufae
Copy link
Collaborator

trufae commented Nov 14, 2023

Sorry for the late response. kinda busy irl :( let me try to give you a proper response here. It's been a while since i really do some thumb/arm binary analysis, and i think it's been always a pending task to be addressed.. and maybe it's the time to do so if you are in the mood to discuss and test things.

I'm not sure the way it works right now, or how it should be working, but basically the way that ARM switches between thumb and arm modes is depending on the lower bit of the program counter which maps to the thumb bit of the cpsr register.

So, i see different things there that must be clarified before moving further.

  • Some ELF symbols tell the disassembler if a function is thumb or arm (but sometimes this doesnt works well and results in bad anal hints. ahb-* is required to get those bins to work) <- there's a ticket for this already.
  • On raw firmwares, the arm/thumb switching is usually spotted by doing esil emulation on the actual program, maybe we should do a collection of assembly snippets that do this, so we can use it for testing in a more precise way.
  • What should ESIL do when PC&1 is set? this is sn arm-thumb-specific thing, and i think it should be clearing the lower bit and setting the thumb flag.
  • Executing an unaligned instruction results in an exception, so I guess the ARM cpus just handle this exception to check the lower bit and do what's described above.
  • But also, branch instructions (the ones where the esil expression modify the PC register) should check for the thumb bit before moving forward, and behave properly.

@@ -2495,7 +2495,7 @@ static int analop_esil(RArchSession *as, RAnalOp *op, ut64 addr, const ut8 *buf,
break;
case ARM_INS_BL:
case ARM_INS_BLX:
r_strbuf_append (&op->esil, "pc,lr,:=,");
r_strbuf_appendf (&op->esil, "pc,%d,+,lr,:=,", thumb);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
r_strbuf_appendf (&op->esil, "pc,%d,+,lr,:=,", thumb);
r_strbuf_appendf (&op->esil, "pc,lr,:=,%d,tf,:=", thumb);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about this solution? its working?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if not we should probably considering checking for the thumb flag in the register set which is more correct for esil cnoncepts than using the hacky lower bit thing in the program counter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I got sidetracked from all this and am abroad at the moment so can't do much, but i'll defer to your experience here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants