Vertex shader ops investigation



Arisotura

Posted on 11-06-14 06:01 PM


Test operation is the following:

[op] d1A, d01, d25 (0x6)

d01 holds a texcoord.
d25 is a constant set to: 0, 0, 0.5, 1
d1A is a temporary register that is later mov'd to the texcoord output registers.

Opdesc 0x6 is declared as follows:
.opdesc xyzw, xyzw, zzzz ; 0x6

So, if I got all that junk right, it should really be, in pseudo-GLSL:

d1A.xyzw = d01.xyzw [op] d25.zzzz;
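To make the intended semantics concrete, here is a minimal Python sketch of that pseudo-GLSL line with `[op]` taken to be a multiply. It only models what the opdesc is supposed to mean, not what the GPU actually does. The helper names `swizzle` and `apply_op` are made up for illustration, and the texcoord values in `d01` are invented; only the `d25` constant comes from the test above.

```python
# Hypothetical model of a swizzled, component-wise vertex shader op.
# It mirrors the pseudo-GLSL reading of opdesc 0x6, nothing more.
COMPONENTS = "xyzw"

def swizzle(vec, pattern):
    """Return the components of vec selected by a pattern such as 'zzzz'."""
    return tuple(vec[COMPONENTS.index(c)] for c in pattern)

def apply_op(op, src1, src1_swz, src2, src2_swz):
    """Apply op component-wise to the two swizzled sources."""
    a = swizzle(src1, src1_swz)
    b = swizzle(src2, src2_swz)
    return tuple(op(x, y) for x, y in zip(a, b))

d01 = (0.25, 0.75, 0.0, 1.0)  # example texcoord (invented values)
d25 = (0.0, 0.0, 0.5, 1.0)    # constant from the test

# .opdesc xyzw, xyzw, zzzz with a multiply: every component is scaled by d25.z
d1A = apply_op(lambda x, y: x * y, d01, "xyzw", d25, "zzzz")
print(d1A)  # (0.125, 0.375, 0.0, 0.5)
```

Under this reading, a working multiply would simply halve the texcoords, which is an easy result to spot on screen.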

In practice I must have done something wrong. None of the known opcodes behaved as expected.

They either had no effect on the texcoords or turned them into all the same value.

____________________
blargSNES -- SNES emu for 3DS
More cool stuff

Arisotura

Posted on 11-06-14 09:54 PM


Okay.

mul d1A, d25, d01 (0x6) <- okay
mul d1A, d01, d25 (0x6) <- doesn't work

Note the order of operands.

(each with a properly adjusted opdesc)


neobrain

Posted on 11-15-14 11:40 AM


I mentioned this on IRC already, but for reference: I think talking about this in a shader language just adds a second level of confusion, so I suggest providing the underlying bytecode in hex alongside each assembly line. That's probably a bit more effort, but then we can at least be sure that if something "odd" is happening, it is because of the GPU rather than some design flaw in the shader assembler.

For example, I think one issue with what you described is that you try to assign a floating-point register to the second source argument of the mul instruction. However, as far as I am aware, src2 only has 5 bits available for addressing registers, which is why 0x25 cannot be indexed by it (unlike src1, which has 7 bits). Obviously, this issue only shows up in aemstro's shading language: had you worked with bytecode directly, you wouldn't even have thought of trying to assign 0x25 to src2.
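The arithmetic behind that can be checked quickly. The following Python sketch uses only the two field widths stated above (7 bits for src1, 5 bits for src2). The helper names are hypothetical, and the masking in `truncate` is an assumption about what a naive encoder might do with an oversized index, not a statement about what aemstro or the hardware actually does.

```python
SRC1_BITS = 7  # src1 field width, as stated in the post
SRC2_BITS = 5  # src2 field width, as stated in the post

def fits(reg_index, bits):
    """True if reg_index is representable in an unsigned field of `bits` bits."""
    return 0 <= reg_index < (1 << bits)

def truncate(reg_index, bits):
    """Assumed naive behaviour: keep only the low `bits` bits of the index."""
    return reg_index & ((1 << bits) - 1)

reg = 0x25  # the constant register used as second source in the test

print(fits(reg, SRC1_BITS))           # True:  0x25 < 0x80
print(fits(reg, SRC2_BITS))           # False: 0x25 >= 0x20
print(hex(truncate(reg, SRC2_BITS)))  # 0x5
```

If an assembler did silently mask the index like this, the instruction would end up reading a completely different register, which would be consistent with results that look unrelated to the constant.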

By the way, nihstro is aware of this limitation and hence swaps the values of src1 and src2 if necessary.
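As a rough illustration of that kind of swap (this is not nihstro's actual code, just a sketch of the idea), an assembler could reorder the sources of a commutative instruction whenever the second one does not fit the narrower field. The function name, the set of commutative ops, and the error handling are all assumptions; the 7-bit and 5-bit widths are the ones mentioned above.

```python
COMMUTATIVE_OPS = {"add", "mul"}  # assumption: ops where source order is irrelevant

def place_operands(op, a, b, src1_bits=7, src2_bits=5):
    """Return (src1, src2) register indices that fit their fields.

    Swaps the sources of a commutative op when that makes them fit;
    raises ValueError when no valid placement is found.
    """
    def fits(reg, bits):
        return 0 <= reg < (1 << bits)

    if fits(a, src1_bits) and fits(b, src2_bits):
        return a, b
    if op in COMMUTATIVE_OPS and fits(b, src1_bits) and fits(a, src2_bits):
        return b, a  # swapped: the wide index goes into the 7-bit field
    raise ValueError(f"cannot encode {op} with sources {a:#x}, {b:#x}")

# "mul d1A, d01, d25" becomes the working form "mul d1A, d25, d01"
print([hex(r) for r in place_operands("mul", 0x01, 0x25)])  # ['0x25', '0x1']
```

This matches the observation earlier in the thread that the mul works with d25 as the first source and fails with it as the second.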
