Saturday, 18 April 2009

Developer of LOVE on How to Learn to Program

I find myself agreeing with Eskil Steenburg's advice on how to learn to program:

I almost never use linked lists (I think there is one place in all of LOVE), and instead I try to use arrays, they are faster, smaller and more cash coherent.
[Edit: Reading the article, I believe Eskil is primarily referring to linked lists where each member of the list is dynamically allocated and pointer referenced. It's possible to do an array based implementation of linked lists which avoids the issues highlighted in the comments, and then to infrequently expand the array using realloc in the event the array fills up. Angband uses the same approach, but usually has a hard limit on the size of each array, instead of dynamically expanding it].

10 comments:

Jonathan S. Fox said...

I find myself saying "Just use the right tool for the job at hand!"

Arrays and linked lists have partially overlapping abilities, but that doesn't mean they're designed to be used for the same things. Linked are better at some things and worse at others. I personally rarely have need of them, but it's not a bias against lists so much as just rarely needing them.

Anywhere you're going to be doing a lot of inserting or deleting into the middle of an array, doing so will require shifting all of the later elements in the array. Inserting and deleting into a linked list takes constant time, while doing the same to an array takes linear time. On the other hand, finding things in a list takes linear time, since the list needs to be walked -- if that's a problem, and you need the advantages of both lists and arrays, you can start looking at trees and other more advanced data structures that will try to give you the best of all worlds.

If I was worried about cache misses, I'd be sooner inclined to write a small memory manager than abandon linked data structures -- but I'm usually not worried about cache misses. If you're performing an algorithm on a large set of data, then keeping that data contiguous in memory is great. Otherwise, it's just not much of an issue most of the time.

Performance in a program is like a stack of money. It exists only so that you can spend it on other things, like algorithms, graphics, and content. People think they love performance, but in reality, nobody ever notices if you have more than "enough".

If a roguelike is slamming your computer, it's usually in one or two specific places, like your line of sight or dungeon generation. When this happens, I'm willing to bet it's because of the algorithmic complexity of your operations, and not because you didn't optimize for the cache.

Pod said...

Lists are invaluable tools and ignoring them altogether is kinda foolish.

Also, unless you're writing a high performance game, I think caring about cache hits is quite silly. You'll find you get more of a performance boost by profiling the game and rearchitecting the offending areas.

Andrew Doull said...

Reading the article, I believe Eskil is primarily referring to linked lists where each member of the list is dynamically allocated and pointer referenced. It's possible to do an array based implementation of linked lists which avoids the issues highlighted in the comments, and then to infrequently expand the array using realloc in the event the array fills up. Angband uses the same approach, but usually has a hard limit on the size of each array, instead of dynamically expanding it

Nimrod Perez said...

I agree that arrays are useful, but lists have a few unique features that are unmatched by arrays.

I'm a huge double linked lists fan!

Jotaf said...

Oh. My. God.

That article contains some of the worst pieces of advice I've read in a long time. Some teachers I've had swear by them. It's the reason some final projects I've seen were done in C++ and OpenGL; using extremely rigid structures; replacing proper math with hideous hacks; and constantly reinventing the wheel. It also represents a good proportion of the unfinished amateur game projects you see on the net.

I've been there, I've done that, it was painful and I didn't learn much. Please don't give that sort of advice to novices!

fu said...

Yes he offers some advice that is dubious to most veteran programmers, but honestly one of the best ways to learn is to do things in a suboptimal way sometimes.

Lists vs arrays for example - you really never get a feel for what they're good/bad at until you actually use them, and find out for yourself where stuff starts to break down.

Pod said...

He doesn't advocate trying them for yourself, he simply disuades anyone for using them. I think a lot of it is very bad advice, especially things such as "I try to avoid using pointers, they result in cache misses".

:/

Brendan MacDonell said...
This comment has been removed by the author.
Brendan MacDonell said...

Jonathan S. Fox said: "... I'm willing to bet it's because of the algorithmic complexity of your operations, and not because you didn't optimize for the cache."
Actually, because accessing main memory is an order of magnitude slower than hitting the L1 cache, cache locality is especially important in algorithms for which N is large, while cache misses are trivially overlooked in small-N situations.

Andrew Doull said: "It's possible to do an array based implementation of linked lists which avoids the issues highlighted in the comments, and then to infrequently expand the array using realloc in the event the array fills up."
Are you talking about allocating all of the list elements in a given section of memory (which doesn't help cache performance) or re-implementing a dynamic array / vector?

Pod said...

http://www.google.com/search?hl=en&client=firefox-a&rls=org.mozilla%3Aen-GB%3Aofficial&hs=EKk&q=array+based+linked+list&btnG=Search

He means thus^.
It's similiar to storing a binary tree in an array.