I have just started coding in Python, after some experience with scripting languages (BASH plus two code-based programmes, SAC and FLAC), so I have a reasonable understanding of basic code structure, loops and so on. My work so far consists mostly of reorganizing and shuffling data between various tables, looking up data in one table based on values from another, and so on.

However, I am getting a bit overwhelmed by all the possible ways of handling data, and 2D data in particular: lists of lists, numpy arrays, numpy record arrays and so on, each with its own way of being loaded from a file, accessed and modified.

Do you know of a summary (preferably for dummies) of the possible data types and how to handle them, access them and switch between them?

If it's google-able, then I haven't searched well enough, and I apologise.

Cheers

Vhailor

  • Did you want to know about all 783000 of them, or only the most common? – Ignacio Vazquez-Abrams Mar 19 '15 at 00:08
  • It looks like I have missed the last 3 of them... but ok, is there some summary of BASIC ways to handle 2D data? Preferably those included in basic Python and Numpy? Perhaps I should specify that I am trying to use it for data analysis: processing loads of time series (X-Y) data... if that helps – Vhailor Mar 19 '15 at 00:13

1 Answer

There are three common array-like types I'll mention here: list and tuple, which are built-in and documented in the Python docs (along with some others), and numpy.array.

List

Lists are built-in, mutable objects that can store any other objects, including other lists, tuples, and numpy arrays (a list of lists is the simplest way to hold 2D data). List literals are written with square brackets ([1,2,3,4]), and they can be indexed (starting from zero) with square brackets:

a = [1,2,3,4]
print(a[1])  # 2
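
Since the question is about 2D data, here is a minimal sketch of a table stored as a list of rows. The values and variable names are made up for illustration; the point is how rows, single items and columns are reached with plain lists.

```python
# A small X-Y table stored as a list of rows (illustrative values only)
table = [
    [0.0, 1.5],
    [1.0, 2.5],
    [2.0, 4.0],
]

second_row = table[1]        # one whole row
value = table[1][0]          # row 1, column 0
table[2][1] = 4.5            # lists are mutable, so an item can be replaced
table.append([3.0, 6.0])     # adding a row is cheap

# Plain lists have no column slicing; a column is collected row by row
y_column = [row[1] for row in table]
print(second_row, value, y_column)
```

Rows are easy to get at this way, while anything column-wise needs a loop or a list comprehension.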

Tuple

Tuples are like lists, but they are written with parentheses ((1,2,3,4)) and are immutable (they can't be modified). In exchange, they are faster than lists for some operations.

a = (1,2,3,4)
a[1] += 1 # raises a TypeError

You can convert from a tuple to a list by passing it as an argument to the built-in list() function, and you can convert the other way with tuple().
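
As a small sketch of that round trip (variable names are only for illustration): convert to a list when something has to change, then convert back.

```python
point = (1, 2, 3, 4)

as_list = list(point)    # mutable copy of the tuple
as_list[1] += 1          # now allowed
back = tuple(as_list)    # immutable again

# The same idea applied to a table whose rows are tuples
rows_as_tuples = [(0.0, 1.5), (1.0, 2.5)]
rows_as_lists = [list(row) for row in rows_as_tuples]
print(back, rows_as_lists)
```

The original tuple is left untouched, because list() and tuple() both build new objects.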

NumPy array objects

NumPy array objects are not built-in; they are part of NumPy. They're created with numpy.array(), which takes a sequence such as a list or tuple (or a list of lists for 2D data) and returns a NumPy array object with the same data:

import numpy as np
a = np.array([1,2,3,4])

NumPy arrays are implemented in C, which usually makes them faster than lists for numerical work on large amounts of data, and NumPy implements a bunch of useful functions for manipulating them (documented in the docs I linked above).
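
For 2D data specifically, the following is a rough sketch of what a NumPy array adds over a list of lists: column slicing, element-wise arithmetic and row selection by condition. The values are made up for illustration.

```python
import numpy as np

rows = [[0.0, 1.5], [1.0, 2.5], [2.0, 4.5]]
data = np.array(rows)                 # 2D array with shape (3, 2)

x = data[:, 0]                        # first column
y = data[:, 1]                        # second column
scaled = y * 2                        # arithmetic applies to every element
selected = data[data[:, 1] > 2.0]     # rows whose second column exceeds 2.0

back_to_lists = data.tolist()         # convert back to a list of lists
print(data.shape, x, scaled, selected)
```

Going from lists to an array is np.array(), and going back is the array's tolist() method.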

About saving and loading them, I recently answered a question about saving NumPy arrays, and all of the methods I mentioned there will work with all three of these array types.
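
The linked answer is not reproduced here, so as an assumption about one option it may cover, here is a sketch using numpy.savetxt and numpy.loadtxt. The file name is hypothetical, and a temporary directory is used only to keep the example self-contained.

```python
import os
import tempfile

import numpy as np

rows = [[0.0, 1.5], [1.0, 2.5], [2.0, 4.5]]   # a plain list of lists

# Hypothetical file name inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), "table.txt")

np.savetxt(path, rows)        # accepts lists and tuples as well as arrays
loaded = np.loadtxt(path)     # what comes back is a NumPy array
as_lists = loaded.tolist()    # convert if plain lists are wanted again
print(loaded.shape, as_lists)
```

Whatever type goes in, loading gives back a NumPy array, so a list or tuple has to be rebuilt explicitly afterwards.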

KSFT
  • An important point to make is that NumPy arrays are pretty much just C arrays at their core. Thus things that would be difficult/slow for C arrays are also slow for NumPy arrays, like appending rows (lists usually have O(1) appending). On the other hand, NumPy arrays have relatively little memory overhead from just raw data, and are also often able to be manipulated and viewed without making copies. – cge Mar 19 '15 at 00:31
  • Thanks for the help! I know the crowd over here is quite interested in how fast the methods are. Honestly, I am trying to find something that might take longer but is easy to handle. I quite often have to do complicated searches through tables, generating combinations of data from multiple tables and then adding data to some other tables... and just recently I was quite surprised to find that the record array data type generated by np.genfromtxt is really handy for column searches, but quite a pain when it comes to adding columns or extracting info from the headers. – Vhailor Mar 19 '15 at 03:59
  • @Vhailor I don't know much about how fast they are, but each one is probably faster at different things, as cge mentioned. – KSFT Mar 19 '15 at 22:34