Input On DVD Rebuilder Guide
|
|
AfterDawn Addict
1 product review
|
2. October 2005 @ 11:36 |
|
Hmm, right I guess I overlooked that. I suggest that you use my settings since the Northwood core also uses SSE2 instructions and SSE2 instructions increase encode speeds.
" Please Read!!! Post your questions only in This Thread or they will go unanswered:
Help with development of BD RB: Donations at: http://www.jdobbs.com/.
|
brobear
Suspended permanently
|
2. October 2005 @ 12:28 |
|
As I mentioned before, I tried it with no apparent difference. I encoded the same files with only the IDCT settings different. I'll give it another try just to make sure.
|
AfterDawn Addict
1 product review
|
2. October 2005 @ 12:31 |
|
SSE2 instructions should provide some speed gains, however small.
|
jdobbs
Senior Member
|
2. October 2005 @ 13:56 |
|
The two Sophocles mentioned are different: one is SSE and the other is SSE2.
The two SSE/MMX versions are just different algorithms that use the same precision levels; one was designed by SKAL.
|
brobear
Suspended permanently
|
2. October 2005 @ 14:48 |
|
@jdobbs
As for the 2 SSE/MMX settings, is one algorithm any better than the other or just different?
This message has been edited since posting. Last time this message was edited on 3. October 2005 @ 02:08
|
brobear
Suspended permanently
|
3. October 2005 @ 02:07 |
|
|
64026402
Senior Member
|
3. October 2005 @ 03:50 |
|
Yawn.
I read about the AMD x2 beating the Pentium D EE some time ago. Late to the party Mr Bear.:)
Donald
|
64026402
Senior Member
|
3. October 2005 @ 03:54 |
|
As for iDCT, you should use SSE2/MMX for accuracy and DVD compliance with a P4 or AMD 64.
Donald
|
UncasMS_3
Member
|
3. October 2005 @ 05:23 |
|
Quote: (32 Bit SSEMMX iDCT (Skal)) is actually the fastest on A64 CPU's, and I found it to be the fastest on my P4 as well, but it's not quite as accurate as the 3 listed above.
fast it is, but saying it is not as accurate as the others is different to what is said here:
http://forum.doom9.org/showthread.php?p=682312#post682312
concerning manually setting sse2/mmx vs default
default has been slightly faster for me:
http://forums.afterdawn.com/thread_view.cfm/234309
second test was done with CCE
270 OPV matrix: angelverylow idct: default - encoding time 62 min
270 OPV matrix: angelverylow idct: sse2/mmx - encoding 63 min
athlon64 venice core
|
jdobbs
Senior Member
|
3. October 2005 @ 06:25 |
|
You have to be really careful in constructing quality tests for iDCT of MPEG. You have to do it from an original source that has no compression at all. I've seen instances where people use DVD VOBs as a source. You can't do that because you'd need to do some method of iDCT in order to get a picture against which to do the comparison.
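For readers wondering what an accuracy comparison between iDCT variants could look like, here is a minimal Python sketch. It is not the code used by DVD Rebuilder or DGDecode. It assumes an IEEE-1180-style setup: random 8x8 coefficient blocks go through a double-precision reference iDCT and through an approximation whose cosine table is rounded to a chosen number of fractional bits. The `frac_bits` parameter and all function names are made up for illustration. Because the input is raw coefficient data, no decoder's own iDCT is needed to produce the reference picture.

```python
import math
import random

N = 8  # MPEG uses 8x8 blocks


def cosine_table(frac_bits=None):
    """Return the 8x8 iDCT basis T[x][u]; optionally round it to fixed point."""
    table = []
    for x in range(N):
        row = []
        for u in range(N):
            c = math.sqrt(0.5) if u == 0 else 1.0
            value = 0.5 * c * math.cos((2 * x + 1) * u * math.pi / 16)
            if frac_bits is not None:
                scale = 1 << frac_bits
                value = round(value * scale) / scale  # simulate limited precision
            row.append(value)
        table.append(row)
    return table


def idct_2d(block, table):
    """Separable 2-D iDCT: out[x][y] = sum_u sum_v T[x][u] * T[y][v] * F[u][v]."""
    tmp = [[sum(table[x][u] * block[u][v] for u in range(N)) for v in range(N)]
           for x in range(N)]
    out = [[sum(table[y][v] * tmp[x][v] for v in range(N)) for y in range(N)]
           for x in range(N)]
    # Round to integers and clamp to the 9-bit signed range used by IEEE 1180.
    return [[max(-256, min(255, round(p))) for p in row] for row in out]


def peak_error(trials=200, frac_bits=8, seed=1):
    """Largest per-pixel difference between reference and approximate iDCT."""
    rng = random.Random(seed)
    reference = cosine_table()
    approx = cosine_table(frac_bits)
    worst = 0
    for _ in range(trials):
        block = [[rng.randint(-256, 255) for _ in range(N)] for _ in range(N)]
        a = idct_2d(block, reference)
        b = idct_2d(block, approx)
        for row_a, row_b in zip(a, b):
            for p, q in zip(row_a, row_b):
                worst = max(worst, abs(p - q))
    return worst


if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(bits, "fractional bits -> peak error", peak_error(frac_bits=bits))
```

With 16 fractional bits the approximation stays within one pixel level of the reference; with very few bits the error should grow quickly. Real decoders differ in more than table precision, so treat this only as an outline of the method.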
This message has been edited since posting. Last time this message was edited on 3. October 2005 @ 06:27
|
brobear
Suspended permanently
|
3. October 2005 @ 06:29 |
|
Donald
Don't know what you were looking at. Like most of the time, with the integrated memory, the AMD did better with games. The Intel EE won out on the other benchmarks. So, the Intel will encode better and the AMD will play games. Still boils down to what one wants. If you're a gamer, go with AMD, if you want a computer go with Intel. I always figured PS2 and XBox were for games, but I'm sure the extreme gamers will disagree. LOL All these threads on video, I thought you might be using your PC for encoding. ;)
In case you have trouble reading them, the stock 3.2GHz Intel is running Whetstone iSSE2 - 13286 MFLOPS and overclocked at 3.52 the MFLOPS go to 14458. The corresponding figures for the Athlon 64 X2 4800+ are 9895 MFLOPS at 2.4 GHz stock and 11077 overclocked to 2.7 GHz. Just depends on what you want to brag about, gaming or working. ;)
Changed the capture so it could be read better. The FX 57 is the single core processor that gets all the bragging. Look at the low scores on that one compared to the AMD dual core and the Intel dual core.
Here's Sophocles' Venice Core for comparison. It doesn't stack up to the stock FX 57. LOL I know, it's a lot more expensive.
This message has been edited since posting. Last time this message was edited on 3. October 2005 @ 07:15
|
Staff Member
2 product reviews
|
3. October 2005 @ 06:45 |
|
Quote: fast it is, but saying it is not as accurate as the others is different to what is said here:
http://forum.doom9.org/showthread.php?p=682312#post682312
I should probably amend my earlier statement to say that it doesn't produce exactly the same results. Whether it's more or less accurate (or equally inaccurate in a different way) is another question that I can't answer without further testing.
|
AfterDawn Addict
1 product review
|
3. October 2005 @ 17:36 |
|
brobear
That's an old benchmark that was made when my CPU was running at only 2.45 GHz. It's now more than 224 MHz faster than that, and my most recent scores to compare are 12,782 for Dhrystone (a minor difference), 4387 for Whetstone (an insignificant difference), and 6982 for Whetstone iSSE2 (over 1000 MFLOPS faster, a significant difference). I once broke 7000. The only reason I lose a little on the first two is because that CPU has a native clock speed of 2.6 GHz and it's overclocked to 2.8 GHz, an increase of 200 MHz. The new FX57 sports the San Diego core, which is the same core as mine except mine has a smaller L2 cache (1 MB versus 512 KB). So you are in effect supporting my claims. My CPU is clocked at 2.664 GHz, which is an increase of 444 MHz. Another point is that all of my benchmarks were done with ZoneAlarm security suite and Microsoft AntiSpyware running in the background. Another point to consider is that my chip sells for $219.00 and the FX57 sells for over $1000. This weekend if I get a chance I'll post some new benchmarks.
Could you provide a link to the web site where you obtained the bench?
|
brobear
Suspended permanently
|
3. October 2005 @ 19:52 |
|
I conceded cost factors. However, both Intel and AMD are raising prices on their better processors. So the market is even there. I'm just glad there are "enthusiasts" who can benefit from some of their cheaper releases. I've been finding out that overclocking for speed's sake alone doesn't always achieve the desired results.
Quote: The only reason I lose a little on the first two is because that CPU has a native clock speed of 2.6 GHz and it's overclocked to 2.8 GHz or an increase of 200 MHz. -Sophocles-
Sometimes PCs become unstable when they're overclocked. A bit too much and the system actually loses performance as it gains speed. A good analogy would be a cargo jet that had to drop the cargo to go faster. That's the reason for doing the benchmarks, losing or gaining for any given reason is part of the count. Hope you don't fry your PC trying to get to the performance level of the FX 57. ;) I think we need to move this back to the hardware section. The Donald pokes and then you guys get flustered when I provide proof Intel is still in the game. LOL Here's the link: http://forums.afterdawn.com/thread_view.cfm/7/235934
|
AfterDawn Addict
1 product review
|
4. October 2005 @ 03:22 |
|
brobear
Systems can become unstable when one overclocks, but every CPU of the same core has the same potential. Let's say that the Venice core was built for clock speeds from 2.0 at the bottom to 2.4 at the top. Since they have the exact same core, they have the same potential. So overclocking the bottom 2.0 to 2.4 is not a stretch. In fact, since the FX57 uses the same core but with more L2 cache and it runs at 2.6, there's a good chance that the 2.0 Venice can match it. And with the increase in clock speed also comes increased encoding speeds.
|
64026402
Senior Member
|
4. October 2005 @ 03:44 |
|
Brobear,
Unfortunately for Intel, one synthetic benchmark for their SSE2 performance does not help when losing almost every other test. AMD has always been faster at SSE than SSE2. Sisoft has a note to that effect in the test area.
CCE belongs to AMD hands down, as well as gaming and folding and any real world uses one can think of. The only real world test I saw that shows promise was DVDshrink, which was slightly faster for Intel when the EE was running at 3.2 and the AMD was at 2.4. They even lost that when both were OCed.
Donald
|
brobear
Suspended permanently
|
4. October 2005 @ 06:07 |
|
I'll ask the same thing that Sophocles asked of me. Where are the benchmarks to support your assumptions? I've seen several. Seems even the one where Comp Power Users did the skewed one with a gaming platform, Intel still came out ahead in the work category.
BTW, I feel sorry for the purchasers of the high end AMD processors. According to Sophocles they're getting seriously hosed. All they need do is buy the cheaper one. Even if they have to pay someone to do the overclocking, it would still be cheaper. That is if there aren't any differences. ;) I'm sure AMD would be more than happy to disagree that one can have the same level of performance with a cheaper Venice core than with their higher end FX 57 processor.
Just a little note for the reader that doesn't understand overclocking: Factory components and BIOS don't readily lend themselves to the process. After all, the retail manufacturer is looking for a stable platform that lasts. They're not into "racing" and designer PCs, except for promotional purposes. Overclocking is best done with aftermarket components designed for the independent builder or enthusiast. So don't think you can get into overclocking with your factory Sony, Gateway, Dell or whatever. Overclocking is mainly a pursuit left to enthusiasts who know what they're doing or are willing to take the chance of blowing their system. This is about building designer PCs, nothing more.
I'm waiting to see Sophocles' new benchmarks meet even the stock benches on the FX. His Venice will be smoking out of the case if he tries to achieve the specs of the overclocked FX. Take pictures, it's not every day you see a "mad scientist" (PC enthusiast) injecting liquid nitrogen into a PC case. LOL
Gentlemen. I believe you need to move your discussion back to the hardware section. As long as there is competition, there is going to be a disagreement over who has the best chip, who has the best setup, and who is going to win whatever contrived test is being set up. This is the last I have to say about the matter in threads off topic. Be more than glad to chat with you in the hardware section.
This message has been edited since posting. Last time this message was edited on 4. October 2005 @ 06:35
|
UncasMS_3
Member
|
5. October 2005 @ 02:19 |
|
following questions on and discussions about the impact iDCT has on the conversion speed, i decided to test all available iDCTs in rebuilder (pro):
i converted Van Helsing once more - 126 min main movie + 35 extras - on an athlon64 @ 3500, 1gb ram with CCE 270 in OPV mode
all conversions were done via batch-processing and of course the pc wasnt touched at all
idct                          encoding time
decoder default               66 min
32 bit mmx                    67 min
32 bit sse/mmx                67 min
64 bit floating point         81 min
64 bit IEEE-1180 reference    83 min
32 bit sse2/mmx               67 min
32 bit ssemmx idct (skal)     64 min
32 bit simple mmx (xvid)      61 min
it is no wonder that the most precise modes (64bit) take longest but ~1/3 speed difference is quite a lot and i guess i wont ever use them until i got myself a dual-core 6ghz machine ;)
xvid idct being the winner is no surprise either - it is considered very fast but not as good/precise
according to rockas the default settings is:
Quote: ... the default value is SSE/MMX not SSE2/MMX... anyway...
SSE2/MMX is compatible with Pentium IV and AMD64...
SSE/MMX is compatible with those two plus Pentium III and Amd ATHLON.
jdobbs on: 32 bit ssemmx idct (skal):
Quote: The two SSE/MMX versions are just different algorithms that use the same precision levels, one was designed by SKAL.
this one is runner-up in terms of speed; the quality of which i cannot comment on as i wont spend time on something that will have to be made by means of image-comparison software and thus cannot be detected by the human eye
i'm a little bit puzzled as to my encoding time using DEFAULT as it does not match ANY other time - according to what i cited above default equals sse/mmx and thus it should have been 67 min not 66 but anyway :confuse5:
a little more surprising to me is the fact that sse2/mmx is NOT faster than mmx//sse/mmx which some people say it outperforms
taking into account that the default setting is #3 out of 8 when talking speed, i should think i will stay with this setting - it has provided good results before and it is not horribly slow compared to other settings
old dog, new tricks you know
of course results may differ with different cpus
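To put the CCE timings above into percentages, here is a small Python sketch. The minute values are copied from the table in this post; the function names are just illustrative.

```python
# Encode times in minutes, copied from the CCE 2.70 OPV table in this post.
cce_minutes = {
    "decoder default": 66,
    "32 bit mmx": 67,
    "32 bit sse/mmx": 67,
    "64 bit floating point": 81,
    "64 bit IEEE-1180 reference": 83,
    "32 bit sse2/mmx": 67,
    "32 bit ssemmx idct (skal)": 64,
    "32 bit simple mmx (xvid)": 61,
}


def relative_to(times, baseline):
    """Percent change in encode time of every mode versus the baseline mode."""
    base = times[baseline]
    return {name: (t - base) / base * 100.0 for name, t in times.items()}


def ranking(times):
    """Mode names ordered from fastest to slowest (ties keep table order)."""
    return sorted(times, key=times.get)


if __name__ == "__main__":
    versus_default = relative_to(cce_minutes, "decoder default")
    for name in ranking(cce_minutes):
        print(f"{name:<28} {cce_minutes[name]:>3} min  {versus_default[name]:+6.1f}%")
```

Measured against the default setting, the two 64 bit modes come out roughly 23% and 26% slower; the "~1/3" figure holds when they are compared with the fastest mode (xvid) instead.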
|
UncasMS_3
Member
|
14. October 2005 @ 15:17 |
|
i have repeated the test with hank315's encoder and opv mode
final output size was also added this time
again i can clearly state that sse2 is NOT the fastest mode like some people claim every now and then
HC 016 TR-2
idct                          time      output size
decoder default               111 min   4.434.924
32 bit mmx                    113 min   4.434.924
32 bit sse mmx                113 min   4.435.880
64 bit floating point         137 min   4.437.452
64 bit IEEE-1180 reference    144 min   4.444.302
32 bit sse2/mmx               113 min   4.434.924
32 bit ssemmx idct (skal)     110 min   4.440.598
32 bit simple mmx (xvid)      111 min   4.433.900
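The output sizes in the HC results can be compared the same way. The sketch below is only an illustration: the values are copied from the table above, the dots are treated as thousands separators, and the unit is left open because the post does not state it. The helper names are made up.

```python
# (minutes, output size as written in the post) per iDCT mode, HC 0.16 TR-2.
hc_results = {
    "decoder default": (111, "4.434.924"),
    "32 bit mmx": (113, "4.434.924"),
    "32 bit sse mmx": (113, "4.435.880"),
    "64 bit floating point": (137, "4.437.452"),
    "64 bit IEEE-1180 reference": (144, "4.444.302"),
    "32 bit sse2/mmx": (113, "4.434.924"),
    "32 bit ssemmx idct (skal)": (110, "4.440.598"),
    "32 bit simple mmx (xvid)": (111, "4.433.900"),
}


def parse_size(text):
    """Turn '4.434.924' into the integer 4434924 (dots are separators)."""
    return int(text.replace(".", ""))


def size_delta_percent(results, baseline="decoder default"):
    """Percent difference in output size of each mode versus the baseline."""
    base = parse_size(results[baseline][1])
    return {name: (parse_size(size) - base) / base * 100.0
            for name, (_, size) in results.items()}


def same_size_as(results, baseline="decoder default"):
    """Modes whose output size is identical to the baseline's."""
    base = results[baseline][1]
    return [name for name, (_, size) in results.items()
            if size == base and name != baseline]


if __name__ == "__main__":
    for name, pct in size_delta_percent(hc_results).items():
        print(f"{name:<28} {pct:+.3f}%")
    print("identical to default:", same_size_as(hc_results))
```

On these numbers the largest size difference from the default is about 0.2% (the IEEE-1180 reference mode), and the default, 32 bit mmx and 32 bit sse2/mmx runs produced identical sizes.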
|
AfterDawn Addict
1 product review
|
14. October 2005 @ 15:39 |
|
UncasMS_3
In all of your tests you're only finding a variance of 2 or 3 minutes across all settings, which could be within a predictable margin of error (except for the 64 bit tests). I'm betting that if you ran the same test over and over again, the results would vary from the previous results each and every time on each and every setting. The point that's being made is that if you have a CPU that is SSE2 compatible and the software is also SSE2 compatible, then one should see some speed gains. The problem here is that as far as I know only the DGDecode.dll can appreciate this advantage. It is also possible that CCE can use SSE2 instructions.
If we were trying to compare dual core CPUs to single core CPUs we would run into a similar problem. CCE is a multithreaded application but the rest of the RB applications setup is not, so the question is how much would one benefit from a dual core system? It's hard to say, but it's expected that the benefit will come from at least CCE. So if DGDecode.dll benefits from SSE2, which we know it does, then by how much?
I don't have the specs of your system but I'm assuming that it's a pretty good system because I don't think that you would be making a case on speed with a mediocre one. But that being said, if any of the applications in the RB setup can use SSE2 and SSE2 is known to increase speeds over its predecessors, then that would be my choice.
|
UncasMS_3
Member
|
14. October 2005 @ 16:24 |
|
i have posted my main system specs in the very first posting
Quote: i converted Van Helsing once more - 126 min main movie + 35 extras - on an athlon64 @ 3500, 1gb ram with cce 270 in OPV mode
and i have actually repeated the tests randomly with certain idcts
mmx + 64bit fp + sse2 + xvid have been run TWO times and the conversion times were exactly identical
i thus rule out any *variance* due to these double testings
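One way to settle the variance question is to repeat each setting a few times and check whether the gap between two settings is larger than the run-to-run spread. The Python sketch below shows the idea under simple assumptions: a plain "difference of means versus k standard deviations" rule, with `k` and the function names chosen for illustration. The timings in the usage lines are hypothetical placeholders, not measurements from this thread.

```python
import statistics


def summarize(runs):
    """Mean and sample standard deviation of repeated encode times (minutes)."""
    mean = statistics.mean(runs)
    spread = statistics.stdev(runs) if len(runs) > 1 else 0.0
    return mean, spread


def difference_exceeds_noise(runs_a, runs_b, k=2.0):
    """True if the gap between the means is larger than k times the bigger spread."""
    mean_a, sd_a = summarize(runs_a)
    mean_b, sd_b = summarize(runs_b)
    noise = k * max(sd_a, sd_b)
    return abs(mean_a - mean_b) > noise


if __name__ == "__main__":
    # Hypothetical repeated runs, for illustration only.
    setting_a = [67, 66, 68]
    setting_b = [66, 67, 67]
    print(difference_exceeds_noise(setting_a, setting_b))
```

If repeated runs of a setting come out identical to the minute, the measured spread is zero and any whole-minute gap counts as a real difference under this rule, although the one-minute timer resolution still limits how fine a difference can be seen.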
+++++
concerning dual cores athlons: procoder2 as well as cce results have been posted @ doom9 already and they clearly show an impressive speed gain
This message has been edited since posting. Last time this message was edited on 14. October 2005 @ 23:47
|
AfterDawn Addict
1 product review
|
14. October 2005 @ 17:06 |
|
Quote: i have posted my main system specs in the very first posting
I must have missed them! Could you point me to them?
Quote: on an athlon64 @ 3500, 1gb ram
Which AMD 3500 core? I would also like to know your motherboard and memory make and specs. If you can, add frontside bus settings and the voltages used for memory and CPU. Memory timings would be helpful too, i.e. CAS, RAS, CAS to RAS, for example 2-2-2-5 T1/T2.
|
UncasMS_3
Member
|
14. October 2005 @ 23:58 |
|
i hope i didnt forget any important detail:
CPU Properties:
CPU Type AMD Athlon 64 3500+
CPU Alias Venice S939
CPU Stepping DH-E3
CPUID CPU Name AMD Athlon(tm) 64 Processor 3000+
CPUID Revision 00020FF0h
CPU Speed:
CPU Clock 2208.36 MHz
CPU Multiplier 9.0x
CPU FSB 245.37 MHz (original: 200 MHz, overclock: 23%)
Memory Bus 200.76 MHz
Voltage 1.4 V
CPU Cache:
L1 Code Cache 64 KB (Parity)
L1 Data Cache 64 KB (ECC)
L2 Cache 512 KB (On-Die, ECC, Full-Speed)
Motherboard Properties:
Motherboard ID 07/26/2005-NF-CK804-A8NSLI-B-00
Motherboard Name Asus A8N-SLI (3 PCI, 2 PCI-E x1, 2 PCI-E x16, 4 DDR DIMM, Audio, Gigabit LAN)
Chipset Properties:
Motherboard Chipset nVIDIA nForce4 SLI, AMD Hammer
Memory Timings 2.5-3-3-5 (CL-RCD-RP-RAS)
Command Rate (CR) 1T
SPD Memory Modules:
DIMM1 512 MB PC3200 DDR SDRAM (3.0-3-3-8 @ 200 MHz) (2.5-3-3-7 @ 166 MHz) (2.0-2-2-6 @ 133 MHz)
DIMM2 512 MB PC3200 DDR SDRAM (3.0-3-3-8 @ 200 MHz) (2.5-3-3-7 @ 166 MHz) (2.0-2-2-6 @ 133 MHz)
BIOS Properties:
System BIOS Date 07/26/05
Video BIOS Date 10/28/04
Award BIOS Type Phoenix - Award BIOS v6.00PG
Award BIOS Message ASUS A8N-SLI ACPI BIOS Revision 1011-002
DMI BIOS Version ASUS A8N-SLI ACPI BIOS Revision 1011-002
Graphics Processor Properties:
Video Adapter Gigabyte GeForce 6600 GT PCI-E
GPU Code Name NV43GT (PCI Express x16 10DE / 0140, Rev A2)
GPU Clock 501 MHz
Memory Clock 501 MHz
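As a quick cross-check of the clock figures in this listing, the CPU clock is the multiplier times the FSB, and the overclock percentage is the FSB increase over stock. A tiny Python sketch (function names are made up for illustration; the inputs are the values reported above):

```python
def cpu_clock_mhz(multiplier, fsb_mhz):
    """CPU clock as multiplier times front side bus clock."""
    return multiplier * fsb_mhz


def overclock_percent(actual_mhz, stock_mhz):
    """Percent increase of the actual clock over the stock clock."""
    return (actual_mhz - stock_mhz) / stock_mhz * 100.0


if __name__ == "__main__":
    print(round(cpu_clock_mhz(9.0, 245.37), 2))        # multiplier and FSB from the listing
    print(round(overclock_percent(245.37, 200.0), 1))  # FSB versus the stock 200 MHz
```

The product lands within a few hundredths of a MHz of the reported 2208.36 MHz CPU clock, and the FSB increase rounds to the 23% shown.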
This message has been edited since posting. Last time this message was edited on 15. October 2005 @ 00:04
|
Staff Member
2 product reviews
|
15. October 2005 @ 08:47 |
|
@UncasMS_3
I've heard of others getting similar results with SSE vs SSE2 on A64 CPUs. My understanding is that A64s are optimized for SSE or SSE2, while the P4 is only really optimized for SSE2. IIRC even the last generation of PIII CPUs performs better (at an equivalent clock speed) for most operations. I imagine it has to do with the longer pipeline used in the P4 for non-SSE2 instructions.
This message has been edited since posting. Last time this message was edited on 15. October 2005 @ 08:48
|
AfterDawn Addict
1 product review
|
15. October 2005 @ 09:32 |
|
PIVs are supposed to be optimized for MMX, SSE, SSE2, and SSE3. You might have a point regarding Intel's longer pipes, which do affect a P4's overall speed. That's one of the reasons that the Northwood core ran a little faster and cooler than the Prescott core did.
You're also right about the speed of the PIII beating the P4 at equal clock speeds. The PIII architecture was never truly shelved; the core was shrunk and redesigned into the Pentium-M. Tomshardware.com placed a Pentium M in a desktop, overclocked it to around 2.5 GHz, and it beat everything in benchmarks. This is of course a slightly older test and AMD has made some changes, but it speaks volumes regarding the P4 dead end.
http://www.tomshardware.com/cpu/20050525/index.html
http://www.tomshardware.com/cpu/20050525/pentium4-21.html
Rumor has it that Intel is looking at this core for future multi-core CPUs. We think that dual core was a great advancement, but rumors also suggest that Intel is looking to put 4 cores on a single chip.
|