0

The code is exactly the same -- I copied it from one computer to another. The code is compiled with g++-4 (4.9.1) obtained from fink on OSX on both machines, and is not run in parallel.

The only compiler option is "-O2", and both computers are otherwise idle (low CPU and memory usage). The code is a 2400-line research code (link).

Machine 1:

  • Late 2013 MacBook Pro Retina,
  • 2.8 GHz i7-4558U,
  • 16GB 1600MHz DDR3,
  • 500GB Flash storage

Machine 2:

  • Late 2013 MacPro Workstation,
  • 3.5GHz 6-Core Intel Xeon E5-1650,
  • 32GB 1867MHz DDR3
  • 251GB Flash storage,
  • 3TB external SATA drive

Run-time:
Machine 1: with output 200 sec., w/o 18 sec.
Machine 2: (/ directory -- should be flash drive): with 2230 sec., w/o 2075 sec.
Machine 2: (~ directory -- should be external drive): with 2262 sec., w/o 2080 sec.

Any ideas of how to improve runtime on the MacPro?

Mokubai
Stershic
  • @Ramhound I was told it would be more relevant here than StackOverflow. – Stershic Aug 21 '14 at 17:34
  • 2
    This question seems extremely broad. What you will need to do to understand the differences is the following. Determine what part of the code takes the longest. You can then modify that code so its fast on both machines. If you get stuck doing that you can at that point ask the question on the correct website. – Ramhound Aug 21 '14 at 17:36
  • @Ramhound It was suggested that I use Instruments to profile my code, and I've been able to do that, but it's not clear to me if/how that shows what part of the code takes the longest. I'm not asking because I'm an expert - I'm asking because I'm not and I want to learn. – Stershic Aug 21 '14 at 17:40
  • 1
    **I can't even research the differences between the i7 and the E5 since I don't have specific model numbers.** My guess is that your code is single-threaded. Since the burst speed of the i7 is around 3.0 GHz, it's unlikely that CPU frequency is what causes the 115% increase to generate an output. – Ramhound Aug 21 '14 at 17:40
  • You have been able to do that, but you don't provide that information. You know how long the code takes. What function exactly is causing the delay? **Of course that question is a Stackoverflow question.** I am basically trying to tell you that you're asking the wrong question. I should also add the following: you were not told this was a Superuser question. The close reason indicated it might be on topic here; it's not. You were given better advice in the comments. – Ramhound Aug 21 '14 at 17:42
  • Run the code on Machine 2 with instrumentation and find out what's taking so long. It should be totally obvious. – David Schwartz Aug 21 '14 at 17:50
  • @ZippityBrosnan - how to use profiling tools certainly seems like an appropriate SO question – user2813274 Aug 21 '14 at 18:03
  • Thanks for your help @Ramhound & David. It takes some digging, but Instruments does tell the % of time taken by methods, and there was a glaring difference between the two. Now that it's down to one method, I'm sure I can figure it out from here. Thanks! – Stershic Aug 21 '14 at 18:03
  • @user2813274 - I would agree. The reason the original question was closed was that it was very broad and basically asked people to review 2k lines of code. The question wasn't updated, everything was in comments, and details were scarce. **I am not shocked it was closed, to be honest** – Ramhound Aug 21 '14 at 18:06
  • @Ramhound I absolutely didn't expect anyone to review the code. I wanted suggestions for how on earth code could take an order of magnitude difference time to run. "It shouldn't" + good use of profiling tools is the correct answer. Take it easy on a beginner – Stershic Aug 21 '14 at 18:18
  • @ZippityBrosnan - I expect people to ask on topic questions. **I don't see how asking how to profile code is a Superuser question.** – Ramhound Aug 21 '14 at 18:44
  • @Ramhound I didn't know that you _could_ profile code. Is that on topic? If not, please give me the pre-approved list of topics for SU. – Stershic Aug 21 '14 at 18:53

2 Answers

1

This is a speculative guess, but your code performs disk I/O, and I am going to assume that this is your bottleneck. You mentioned that it runs faster on the machine with 500GB flash storage than on the one with 250GB flash storage. This makes sense, because flash storage is essentially a RAID-0 of smaller (32/64GB) flash chips, and more chips in a RAID-0 array can greatly increase performance. I do not know the particular make/model/firmware/controller of the storage, but I suspect that a disk I/O test would show a similar discrepancy in performance between the two machines. Such a test can be run with XBench.
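If a benchmarking tool is not at hand, a rough comparison can be made from C++ itself. The following is only a minimal sketch, not a substitute for a real disk benchmark: the function name `write_rate_mib_per_s` and the scratch file path are made up for illustration, it assumes `-std=c++11`, and because the operating system buffers writes, the number it reports reflects cached write speed rather than the raw speed of the flash chips.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original code): writes `total_mib`
// MiB of dummy data to `path`, deletes the file, and returns the observed
// write rate in MiB/s. Returns 0.0 if the file cannot be written.
double write_rate_mib_per_s(const std::string& path, std::size_t total_mib) {
    const std::size_t chunk_bytes = 1024 * 1024;  // write in 1 MiB chunks
    const std::vector<char> buffer(chunk_bytes, 'x');

    const auto start = std::chrono::steady_clock::now();
    bool ok = false;
    {
        std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
        for (std::size_t i = 0; out && i < total_mib; ++i) {
            out.write(buffer.data(),
                      static_cast<std::streamsize>(buffer.size()));
        }
        out.flush();
        ok = out.good();
    }  // the stream is closed here, before the clock is stopped
    const auto stop = std::chrono::steady_clock::now();

    std::remove(path.c_str());  // clean up the scratch file

    const double seconds =
        std::chrono::duration<double>(stop - start).count();
    if (!ok || seconds <= 0.0) {
        return 0.0;
    }
    return static_cast<double>(total_mib) / seconds;
}
```

Calling, for example, `write_rate_mib_per_s("scratch.bin", 512)` once in `/` and once in `~` on each machine gives numbers that can be compared directly. If they are similar on both machines, disk I/O is probably not the bottleneck and this guess can be set aside.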

user2813274
  • @Ramhound: why would you revert my reference to XBench? As this answer suggests, the performance difference is probably caused by the HDD speed. It is suggested to measure that speed, and I added a link to a tool to test that. Why delete that? – agtoever Aug 21 '14 at 18:45
  • @agtoever - It's complicated. My bad – Ramhound Aug 21 '14 at 18:49
0

The proper way to approach the question "why does this code take so long to run", whether "long" is in absolute or relative terms, is to use a tool called a profiler.

Basically, you run the program through the profiler or with the profiler attached, and the profiler records how much time the program spends in various functions. This information is then presented to you in a form that allows you to pinpoint the parts of the program that took the longest to run during that execution. Often it will also be possible to get additional information from that report, such as which parts of the program are called the largest number of times, and things like that, which can also point toward areas that could use some scrutiny.

Based on that data, it's usually easy to tell which parts need to be optimized so that the program runs faster, without employing the guessing game known as "premature optimization" or relying on the particulars of some specific piece of hardware.
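To make the idea concrete, here is a minimal sketch of doing by hand what a profiler automates: timing named sections of a program and counting how often each one runs. This is not how Instruments or any particular profiler works internally (many of them sample the call stack instead), and the names `ScopedTimer`, `SectionStats`, `section_registry` and `print_report` are invented for this example. It assumes the code is compiled with `-std=c++11`.

```cpp
#include <chrono>
#include <map>
#include <ostream>
#include <string>
#include <utility>

// Hypothetical bookkeeping record: total time and number of calls per label.
struct SectionStats {
    double seconds = 0.0;
    long calls = 0;
};

// Hypothetical global registry mapping a section label to its statistics.
inline std::map<std::string, SectionStats>& section_registry() {
    static std::map<std::string, SectionStats> registry;
    return registry;
}

// Starts a clock on construction and, on destruction, adds the elapsed
// time to the registry entry for its label.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto stop = std::chrono::steady_clock::now();
        SectionStats& stats = section_registry()[label_];
        stats.seconds +=
            std::chrono::duration<double>(stop - start_).count();
        ++stats.calls;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Writes one line per section: label, call count, accumulated seconds.
inline void print_report(std::ostream& os) {
    for (const auto& entry : section_registry()) {
        os << entry.first << ": " << entry.second.calls << " calls, "
           << entry.second.seconds << " s\n";
    }
}
```

Placing `ScopedTimer timer("assemble");` as the first statement of a function, with a label of your choosing, and calling `print_report(std::cout)` (after including `<iostream>`) at the end of `main` produces a crude version of the per-function breakdown described above. Running the same instrumented build on both machines shows which section accounts for the difference. A real profiler gives the same kind of answer without modifying the code.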

user