61

I want to build a Python function that calculates,

[image: summation formula]

and would like to name my summation function Σ. Similarly, I would like to use Π for a product, and so on. Is there a way to name a Python function like this?

def Σ (..):
 ..
 ..

That is, does Python support Unicode identifiers, and if so, could someone provide an example?

Thanks!


Original motivation for this was a piece of Clojure code I saw today that looks like,

(defn entropy [X]
      (* -1 (Σ [i X] (* (p i) (log (p i))))))

where Σ is a macro defined as,

(defmacro Σ
    ... )

and I thought that was pretty cool.


BTW, to address a couple of comments about readability - with a lot of stats/ML code for instance, being able to compose operations with symbols would be really helpful. (Especially for really complex integrals et al)

φ(z) = ∫(N(x|0,1,1), -∞, z)

vs

Phi(z) = integral(N(x|0,1,1), -inf, z)

or even just the lambda character for lambda()!

viksit
  • 7
    Although not as cool, Python's summation function is pretty elegant: `sum()` – Nick Presta Apr 15 '10 at 23:04
  • agree. I meant more for other things here, like integrals, greek letters, et al. – viksit Apr 15 '10 at 23:22
  • 3
    Sounds like a horrible idea for ease of input (presumably $\sum$ wouldn't work, right?) – Benjamin Bannier Apr 15 '10 at 23:34
  • @honk - I'm guessing you can simply have a LaTeX map in vim or emacs or whatever to do the insertions for you when you type code in. – viksit Apr 16 '10 at 00:14
  • This is a terrible idea. If I have to copy and paste to call your functions, you've done something very wrong. – Glenn Maynard Apr 16 '10 at 01:43
  • 1
    Maybe you want to have a look at Fortress which allows Unicode and TeX style notation. –  Apr 16 '10 at 08:09
  • 7
    “Sounds like a horrible idea for ease of input” — depends what keyboard shortcuts you’ve got, doesn’t it? Curly quotes, like the kind I used at the start of this comment, are a bit of a drag to type by default in Windows (I believe), but have decent shortcuts on the Mac. If you do a lot of mathy programming, you could configure shortcuts to make the typing easy. – Paul D. Waite Apr 16 '10 at 09:30
  • @unbeknown, @Paul: My comment wasn't entirely serious, but more along the lines of Glenn's comment. – Benjamin Bannier Apr 22 '10 at 20:13
  • While `\sum` won't work, with a good setup (like TeX input mode in Emacs), `\Sigma` *would* work. – Tikhon Jelvis Jan 05 '13 at 02:10
  • Note that `ϕ` and `φ` are considered the same variable name in Python. :/ – endolith Jan 30 '18 at 15:53
  • ``` def (): pass ``` – Jasper Jun 12 '18 at 14:05
  • 1
    ϕ and φ are variants of the same symbol, so it makes sense to be the same identifier (specially when you're reading code out loud) – villasv Sep 26 '18 at 14:35
  • A discussion of implementing a plugin: https://www.reddit.com/r/Python/comments/clijjd/d_helper_plugin_espeically_for_statisticicsmletc/ :) – ch271828n Aug 04 '19 at 00:54
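The ϕ/φ observation in the comments follows from Python converting identifiers to Unicode normal form NFKC while parsing. A minimal sketch using only the standard library (the variable names here are illustrative, not from the thread):

```python
import unicodedata

phi_symbol = "\u03d5"  # ϕ GREEK PHI SYMBOL
phi_letter = "\u03c6"  # φ GREEK SMALL LETTER PHI

# NFKC folds the symbol variant onto the ordinary Greek letter.
same_after_nfkc = unicodedata.normalize("NFKC", phi_symbol) == phi_letter

# Assign through both spellings and inspect which names actually get bound.
namespace = {}
exec(phi_symbol + " = 1\n" + phi_letter + " = 2", namespace)
names = [key for key in namespace if key != "__builtins__"]

print(same_after_nfkc)
print(names, namespace[phi_letter])
```

Only one name ends up bound, so the second assignment overwrites the first.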

5 Answers

50

(I think it’s pretty cool too, that might mean we’re geeks.)

You’re fine to do this with the code you have above in Python 3. (It works in my Python 3.1 interpreter at least.) See:

But in Python 2, identifiers can only be ASCII letters, numbers and underscores.
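As a rough illustration of what this looks like in Python 3, here is a small sketch. The bodies of `Σ` and `Π` are assumptions made for the example (thin wrappers over `sum` and a `reduce`-based product), not anything prescribed by the question:

```python
from functools import reduce
import operator


def Σ(values):
    """Sum of an iterable (a thin wrapper around the built-in sum)."""
    return sum(values)


def Π(values):
    """Product of an iterable; returns 1 for an empty iterable."""
    return reduce(operator.mul, values, 1)


print(Σ([1, 2, 3, 4]))  # 10
print(Π([1, 2, 3, 4]))  # 24
```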

Paul D. Waite
29

It's worth pointing out that Python 3 does support Unicode identifiers, but only allows letter or number like symbols (see http://docs.python.org/3.3/reference/lexical_analysis.html#identifiers for full details). That's why Σ works (remember that it's a Greek letter, not just a math symbol), but √ doesn't.
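One way to see the letter-versus-symbol distinction for yourself is to look at each character's Unicode general category alongside `str.isidentifier()`. This is only a sketch; the helper name `check` is made up for the example:

```python
import unicodedata


def check(ch):
    """Return (general category, whether ch alone is a valid identifier)."""
    return unicodedata.category(ch), ch.isidentifier()


for ch in ["Σ", "Π", "φ", "√", "∑"]:
    print(ch, check(ch))
```

Letters report a category starting with `L` (for example `Lu` for Σ), while √ and ∑ report `Sm` (math symbol) and are rejected.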

For anyone interested, I made a website that lists every Unicode character that is valid in a Python variable name: https://www.asmeurer.com/python-unicode-variable-names/ (be warned that there are quite a lot of them, over 100,000 in fact).

asmeurer
21

(this answer is meant to be a minor addendum not a complete answer)

The additional gotcha with Unicode identifiers (which @mike-desimone mentions, and which I discovered quickly when I thought this was a cool thread and switched to a terminal to play with it) is that the multiple versions of each glyph are not equivalent, and which one you get depends on the input method on each platform. For example, Σ (Greek capital letter sigma, U+03A3; I can't find a direct Mac input method) is fine, but unfortunately ∑ (N-ary summation, U+2211, typed with opt/alt-w on Mac OS X) is not a valid identifier.

>>> Σ = 20
>>> Σ
20

but

>>> ∑ = 20
File "<input>", line 1
  ∑ = 20
  ^
SyntaxError: invalid character in identifier

Using Σ specifically (and probably Unicode characters in general) as an identifier might generate some very hard-to-diagnose errors if you have multiple developers on multiple platforms contributing to your code. For example, try to debug this visually:

∑ looks very similar to Σ, depending on the typeface selected

The two glyphs are easier to differentiate on this page, but depending on the font used, this may not be the case.

Even the traceback isn't much clearer unless Σ is printed near the ∑

  File "~/Dev/play_python33/identifiers.py", line 12
    print(∑([2, 2, 2, 2, 2]))
            ^
SyntaxError: invalid character in identifier
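When two glyphs are hard to tell apart by eye, printing their code points and Unicode names can help. A minimal sketch, where `describe` is a hypothetical helper written for this example:

```python
import unicodedata


def describe(text):
    """Return (char, code point, Unicode name, valid identifier?) for each non-ASCII char."""
    rows = []
    for ch in text:
        if ord(ch) > 127:
            rows.append(
                (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), ch.isidentifier())
            )
    return rows


# Run it on the offending line from the traceback.
for row in describe("print(∑([2, 2, 2, 2, 2]))"):
    print(row)
```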
Peter Hanley
16

According to is it bad, you can use some unicode characters, but not all: You are restricted to characters identified as letters.

>>> α = 3
>>> Σ = sum
>>> import math
>>> √ = math.sqrt
  File "<stdin>", line 1
    √ = math.sqrt
    ^
SyntaxError: invalid character in identifier

Besides, I think it is very cool to be able to use Unicode characters as identifiers, and I wish I could use them all.

I use the neo keyboard layout, which gives me greek and math symbols on extra layers:

αβχδεφγψιθκλνοπϕστ[&ωξυζ
∀⇐ℂΔ∃ΦΓΨ∫Λ⇔Σ∈ℚℝ∂⊂√∩Ξ

Arne Babenhauserheide
  • 3
    Also, there are often distinct versions of characters that are also Greek letters. For example, the Greek capital sigma is U+03A3, while the math sigma is U+1D6BA, U+1D6F4, U+1D72E, U+1D768, or U+1D7A2 depending on styling. Similarly, Greek capital omega is U+03A9, math omegas start at U+1D6C0, and the Ohms symbol is U+2126. – Mike DeSimone Jun 19 '14 at 12:23
  • 1
    Another nice way to enter most symbols is the compose key, e.g. on Windows via [WinCompose](https://github.com/SamHocevar/wincompose) – Tobias Kienzler Feb 23 '15 at 13:03
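The distinct code points listed in the comment above can be compared with `unicodedata`. The sketch below only checks what NFKC normalization (which Python applies to identifiers while parsing) does to a few of them; `fold` is a name invented for the example:

```python
import unicodedata


def fold(ch):
    """Return the NFKC-normalized form of a character."""
    return unicodedata.normalize("NFKC", ch)


candidates = {
    "Greek capital sigma": "\u03a3",
    "mathematical bold capital sigma": "\U0001d6ba",
    "Greek capital omega": "\u03a9",
    "ohm sign": "\u2126",
}

for label, ch in candidates.items():
    print(f"{label}: U+{ord(ch):04X} -> U+{ord(fold(ch)):04X}")
```

The styled and symbol variants fold onto the plain Greek letters, so as identifiers they would name the same variable even though they are different characters in the source file.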
7

Python 2.x does not support Unicode identifiers, and consequently does not support Σ as an identifier. Python 3.x does support Unicode identifiers, although many people will get cross if they have to edit source files with, for example, identifiers A and Α (Latin A and Greek capital alpha). Sigma is often readable enough, but still not as readable as the word sigma, so why bother?
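If look-alike identifiers such as Latin A and Greek Α are a worry, one possible safeguard is to list every non-ASCII name in a source file. This is a sketch built on the standard `tokenize` module; `non_ascii_names` is a hypothetical helper, not an existing tool:

```python
import io
import tokenize


def non_ascii_names(source):
    """Return the set of identifiers in `source` that contain non-ASCII characters."""
    found = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and any(ord(c) > 127 for c in tok.string):
            found.add(tok.string)
    return found


# "\u0391" is GREEK CAPITAL LETTER ALPHA, which looks like Latin "A".
sample = "A = 1\n\u0391 = 2\nprint(A, \u0391)\n"
print(non_ascii_names(sample))
```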

Thomas Wouters
  • 10
    I think readability of words versus symbols depends on context. When I’m reading something mathy, I find symbols (e.g. `x + y`) more readable than the wordy equivalents you’d get in, say, AppleScript (e.g. `add x to y`). Symbols are terser, and generally let you get by on shape recognition alone, which I think is easier on the brain than reading. I don’t do enough mathy stuff to have felt the need to add a sigma sign to my code though. – Paul D. Waite Apr 15 '10 at 23:05
  • Sure, there are plenty of cases where symbols are more readable than words. Or where non-ASCII characters express things better. I was mostly commenting on the fact that an identifier consisting of a single sigma isn't really an improvement over the word 'sigma' :) – Thomas Wouters Apr 15 '10 at 23:10
  • Just added an edit for this. Try composing something like, return (lambda x: N/(sigma * (2*numpy.pi)**.5) * numpy.e ** (-(x-mu)**2/(2 * sigma**2))) hehe. – viksit Apr 15 '10 at 23:21
  • 2
    That doesn't look any more readable with unicode identifiers to me. – Thomas Wouters Apr 15 '10 at 23:28
  • 3
    “That doesn't look any more readable with unicode identifiers to me.” — It does look more similar to the equation posted at the top of the question though. If someone was used to reading equations like that, mightn’t they find the symbol-y Python code more readable too? – Paul D. Waite Apr 16 '10 at 09:31
  • "many people will get cross if they have to edit source files with, for example, identifiers A and Α (latin A and greek capital alpha.)" I know I would. And I'm Greek. – Manos Dilaverakis Apr 16 '10 at 09:49
  • 3
    @Paul: sure, readability is always subjective. The audience is important. Which is why you need to consider the audience more than your own preferences. It's easy if you're always going to be your own entire audience, of course, but frequently things that start out that way end up in a wider distribution, and with a wider set of contributors. – Thomas Wouters Apr 16 '10 at 10:32
  • 4
    One place where Unicode identifiers will be nice is in iPython Notebook, because you can have variable names that are named the same as the variables they represent. For example, the variable representing a chip's thermal impedance from junction to ambient is θJA, and constantly writing it as `THETA_JA` makes it harder for non-programmers to read the code. – Mike DeSimone Jun 19 '14 at 12:29
  • @ThomasWouters My biggest problem is that "lambda" means something in Python, and is also a *very* common letter used in my field. It's irksome to have to maintain code where it's got non-standard caps, misspelled, etc. – Fomite Jul 08 '14 at 23:43
  • 1
    @viksit `return (λ x: N/(σ * (2 * np.π)**.5) * np.e ** (-(x-μ)**2/(2 * σ**2)))` looks much clearer to me - and it’s easy to type, because I have these letters on my keyboard (and if you’re german, so should you. See http://neo-layout.org) — note that this isn’t actually valid python, because python only reserves lambda, but not λ. – Arne Babenhauserheide Oct 18 '16 at 15:57
  • 1
    @ArneBabenhauserheide Actually I think the fact python only reserves lambda is a feature, not a bug, as you can use λ as an actual variable name then. – Imperishable Night Nov 24 '17 at 05:03
  • 1
    that’s possible, yes. – Arne Babenhauserheide Nov 24 '17 at 08:13
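To make the last few comments concrete: only the ASCII word `lambda` is a keyword, so `λ` is free to be an ordinary variable, and the Gaussian from the earlier comment can be spelled with Greek parameter names. The sketch below assumes `π` is bound by hand to `math.pi` and uses a normalizing constant of 1 in place of the `N` in the comment; the function name `N` is reused here purely for illustration:

```python
import keyword
import math

π = math.pi  # bound manually; the math module itself only provides `pi`


def N(μ, σ):
    """Return the density function of a normal distribution with mean μ and std dev σ."""
    return lambda x: 1 / (σ * (2 * π) ** 0.5) * math.e ** (-(x - μ) ** 2 / (2 * σ ** 2))


λ = 0.5  # a plain variable; it cannot replace the `lambda` keyword

print(keyword.iskeyword("lambda"), keyword.iskeyword("λ"))
print(N(0, 1)(0))
```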