path: root/apps/codecs/lib/udiv32_armv4.S
Commit message (Author, Age)
* Improvements to specialized dividers for APE codec: (Andrew Mahone, 2010-01-28)
  * Use Newton-Raphson divider on ARMv5e and ARMv6, about 7% speedup on Gigabeat S.
  * On ARMv4 targets using IRAM, remove insane filter buffer from IRAM, fill available IRAM with LUT of reciprocals for small divisors - speedup varies according to target and available IRAM, APE normal sample is approx. 109% RT on e200.
  * Rename apps/codecs/lib/udiv32_armv4.S to apps/codecs/lib/udiv32_arm.S, which includes dividers for all ARM targets specialized for APE.
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24354 a1c6a512-1295-4272-9138-f99709370657
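The reciprocal table mentioned in this commit can be illustrated in C. This is only a sketch of the general technique, not the routine from udiv32_arm.S: the names `recip_lut`, `recip_lut_init`, `udiv32_lut` and the table size `RECIP_LUT_SIZE` are hypothetical, and the fallback to the C `/` operator stands in for whatever general divider the real code uses. The caller is assumed to pass a nonzero divisor.

```c
#include <stdint.h>

/* Hypothetical table size; the commit sizes its table to the IRAM left over. */
#define RECIP_LUT_SIZE 64

/* recip_lut[d] holds floor(2^32 / d) for 2 <= d < RECIP_LUT_SIZE. */
static uint32_t recip_lut[RECIP_LUT_SIZE];

static void recip_lut_init(void)
{
    for (uint32_t d = 2; d < RECIP_LUT_SIZE; d++)
        recip_lut[d] = (uint32_t)(0x100000000ULL / d);
}

/* Unsigned n / d, assuming d != 0. Small divisors use multiply-by-reciprocal. */
static uint32_t udiv32_lut(uint32_t n, uint32_t d)
{
    if (d < 2 || d >= RECIP_LUT_SIZE)
        return n / d;            /* stand-in for the general divider */

    /* High word of n * floor(2^32 / d): equal to the true quotient
     * or one below it, never above. */
    uint32_t q = (uint32_t)(((uint64_t)n * recip_lut[d]) >> 32);

    /* One fix-up step is enough: the remainder estimate is below 2 * d. */
    if (n - q * d >= d)
        q++;
    return q;
}
```

Because the truncated reciprocal can only underestimate, the quotient estimate is off by at most one, which is why a single conditional increment suffices.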
* Invert divisor earlier in udiv32_arm, allowing the div0 test to be done before entering the 32-bit divide portion of the code, and making the handling of div0 simpler. (Andrew Mahone, 2010-01-03)
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24166 a1c6a512-1295-4272-9138-f99709370657
* Use long jump to reach __div0 from udiv32_arm if building with IRAM and without EABI. (Andrew Mahone, 2010-01-03)
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24152 a1c6a512-1295-4272-9138-f99709370657
* More comments for udiv32_armv4.S, reduce zero divisor test to one cycle for the skipped branch by setting flags when inverting divisor, 32-bit numerators are handled by calling the 31-bit divider and fixing the results. (Andrew Mahone, 2010-01-03)
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24151 a1c6a512-1295-4272-9138-f99709370657
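One way to handle a 32-bit numerator with a 31-bit divider is to halve the numerator, divide, double the quotient and correct it once. The C sketch below shows that idea under stated assumptions: `udiv31` is a hypothetical stand-in (implemented with `/`) for a divider valid only when both operands are below 2^31, `udiv32_via_31` is likewise a made-up name, and the log does not say that the assembly performs its fix-up in exactly this way. The divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* Hypothetical stand-in for a divider that requires n, d < 2^31 and d != 0. */
static uint32_t udiv31(uint32_t n, uint32_t d)
{
    return n / d;
}

/* Full 32-bit unsigned division built on top of the 31-bit divider. */
static uint32_t udiv32_via_31(uint32_t n, uint32_t d)
{
    if (d & 0x80000000u)          /* huge divisor: quotient is 0 or 1 */
        return n >= d;
    if (!(n & 0x80000000u))       /* common case: both operands fit */
        return udiv31(n, d);

    /* n = 2 * (n >> 1) + (n & 1). Dividing the halved numerator and
     * doubling gives a quotient that is at most one too small. */
    uint32_t q = udiv31(n >> 1, d) << 1;
    if (n - q * d >= d)
        q++;
    return q;
}
```

The product `q * d` never exceeds `n`, so the remainder check cannot wrap around.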
* Add missing EOF newline. (Andrew Mahone, 2010-01-02)
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24143 a1c6a512-1295-4272-9138-f99709370657
* Remove special cases from udiv32_armv4.S, except for zero divisor and large numerator. Improvement of 1.23MHz on e200 with ape normal. (Andrew Mahone, 2010-01-02)
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24142 a1c6a512-1295-4272-9138-f99709370657
* Add 31/31-bit unsigned division in apps/codecs/lib/udiv_arm.S, with 2 cycles / iteration, falling back to previous 32-bit, 3 cycle / iteration code when needed (well under 1% of divisions in sample file). APE normal sample is now 96.90% realtime, approx 1.3% improved vs svn. TODO: unify divisor normalization for both trial subtraction routines, possibly use divisor bits to select 31- vs 32-bit division. (Andrew Mahone, 2009-12-31)
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24130 a1c6a512-1295-4272-9138-f99709370657
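A plausible reason a 31-bit restriction makes trial subtraction cheaper is that, with both operands below 2^31, the top bit of the difference alone tells whether the subtraction succeeded, so no separate comparison is needed. The C sketch below illustrates divisor normalization followed by trial subtraction on that assumption. It is not a transcription of the assembly, and `udiv31_sketch` is a hypothetical name. It assumes n < 2^31 and 0 < d < 2^31.

```c
#include <stdint.h>

/* Trial-subtraction division, assuming n < 2^31 and 0 < d < 2^31. */
static uint32_t udiv31_sketch(uint32_t n, uint32_t d)
{
    uint32_t q = 0;
    int shift = 0;

    /* Normalize: find the largest shift for which (d << shift) <= n,
     * or leave shift at 0 if there is none. Comparing against n >> 1
     * keeps every shifted divisor below 2^31. */
    while ((d << shift) <= (n >> 1))
        shift++;

    for (; shift >= 0; shift--) {
        uint32_t diff = n - (d << shift);   /* trial subtraction */
        q <<= 1;
        /* Both operands are below 2^31, so a set top bit in the
         * wrapped difference means the subtraction went negative. */
        if (!(diff & 0x80000000u)) {
            n = diff;
            q |= 1u;
        }
    }
    return q;
}
```

A caller could route operands to this path only when neither has its top bit set and use a slower, fully general routine otherwise, which matches the fallback arrangement the commit describes.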
* ARMv4 unsigned integer division: (Jens Arnold, 2008-11-06)
  * Using an overflow-safe comparison method in the main calculation allows putting back the 1.5 cycle (average) optimisation. Shaved off another instruction, as we don't need the remainder.
  * Use the very efficient ffs algorithm from ffs-arm.S for dividing by a power of 2.
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19032 a1c6a512-1295-4272-9138-f99709370657
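Both ideas in this commit can be sketched in C. The comparison `n >= (d << shift)` loses high bits of `d` when the shifted divisor overflows, whereas `(n >> shift) >= d` gives the same answer without overflow. Whether this is the exact comparison used in the assembly is an assumption, and `lowest_set_bit` is a simple portable stand-in for the ffs routine in ffs-arm.S, not a copy of it. Both function names are hypothetical, and the divisor is assumed to be nonzero.

```c
#include <stdint.h>

/* Index of the lowest set bit of x; x must be nonzero. */
static unsigned lowest_set_bit(uint32_t x)
{
    unsigned i = 0;
    while (!(x & 1u)) {
        x >>= 1;
        i++;
    }
    return i;
}

/* Unsigned n / d for any 32-bit operands, assuming d != 0. */
static uint32_t udiv32_sketch(uint32_t n, uint32_t d)
{
    /* A power of two has exactly one bit set: divide by shifting. */
    if ((d & (d - 1u)) == 0)
        return n >> lowest_set_bit(d);

    uint32_t q = 0;
    for (int shift = 31; shift >= 0; shift--) {
        /* Overflow-safe form of "n >= (d << shift)". When it holds,
         * d << shift is at most n, so the shift below cannot overflow. */
        if ((n >> shift) >= d) {
            n -= d << shift;
            q |= 1u << shift;
        }
    }
    return q;
}
```

This form stays correct for divisors with the most significant bit set, the case that a naive shifted comparison gets wrong.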
* This optimisation breaks for very large divisors (MSB set), so remove it. (Jens Arnold, 2008-11-05)
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19012 a1c6a512-1295-4272-9138-f99709370657
* Further optimised (vs. libgcc) unsigned 32 bit division for ARMv4 (based on the ARMv5(+) version from libgcc), in IRAM on PP for better performance on PP5002, and put into the codeclib for possible reuse. APE -c1000 is now usable on both PP502x and PP5002 (~138% realtime, they're on par now). Gigabeat F/X should also see an APE speedup. (Jens Arnold, 2008-11-05)
  git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19009 a1c6a512-1295-4272-9138-f99709370657