Thursday, November 4, 2010

Obtaining the name of a function in Haskell

Someone on #haskell asked whether you could recover a function's name from its value, i.e.

GHCi> name id
"id"
GHCi> name map
"map"

This is easy in some languages. But Haskell is not designed to provide this kind of run-time information. We'll need some non-portable hacks. I tested this code with GHC 6.12.1 on amd64 Linux; see below for portability notes.

How it works

Most values in GHC Haskell are represented by a closure. Closures representing functions are common in functional-language compilers, but GHC extends the idea to cover many other sorts of values.

Closures exist to store data. But the run-time system also needs operational information about each closure: how to garbage-collect it, how to force its evaluation, etc. This information is known at compile time and is shared between many closures. All algebraic values with the same constructor will share this information, as will all function values created from the same lambda in the program's source.

So each closure stores a pointer to an info table, holding this operational information. Info tables are generated at compile time, and stored as part of an executable's read-only data section. This means that they have statically-known addresses, with associated names in the executable's symbol table. We'll use these symbol names to name our functions.

We can dump an executable's symbol table with nm:

$ nm -f posix foo
...
ghczmprim_GHCziBool_Bool_closure_tbl D 0000000000749978 
ghczmprim_GHCziBool_False_closure D 0000000000749970 
ghczmprim_GHCziBool_False_static_info T 00000000004e4dd8 
ghczmprim_GHCziBool_True_closure D 0000000000749990 
ghczmprim_GHCziBool_True_static_info T 00000000004e4d80 
ghczmprim_GHCziDebug_debugErrLn1_closure D 00000000007499a0 
ghczmprim_GHCziDebug_debugErrLn1_info T 00000000004e4ec0 
...

Haskell identifiers can contain characters not allowed in symbol names. GHC uses a name-mangling scheme to build symbol names. For example, the first symbol above decodes to

ghc-prim_GHC.Bool_Bool_closure_tbl

Reading the symbols

We'll use vacuum to inspect heap objects. vacuum can do a lot of cool things, like visualize heap structures from a running Haskell program. We'll only use one of vacuum's functions here.

We'll also use the GHC API to un-mangle symbol names. GHC is a 20-year effort that has evolved alongside the Haskell language. It follows some legacy conventions like a mostly-flat module hierarchy. So the module we need is named simply Encoding.

import Data.Word
import Data.Maybe
import Text.Printf
import Control.Parallel ( pseq )

import qualified Data.Map           as Map
import qualified System.Posix.Files as Posix
import qualified System.Process     as Proc
import qualified Foreign.Ptr        as Ptr

import qualified GHC.Vacuum         as Vac
import qualified Encoding           as GHC

We invoke nm as a subprocess and parse its output:

type Symbols = Map.Map Word String

getSymbols :: IO Symbols
getSymbols = do
  exe <- Posix.readSymbolicLink "/proc/self/exe"
  out <- Proc.readProcess "nm" ["-f", "posix", exe] ""
  let offset = 0x10
  let f (sym:_:addr:_) = Just (read ("0x"++addr) - offset, GHC.zDecodeString sym)
      f _ = Nothing
  return . Map.fromList . catMaybes . map (f . words) . lines $ out

We're using the Linux proc filesystem to get a symbolic link to our application's executable.

The symbols in memory appear at an address 0x10 = 16 bytes or 2 machine words lower than in the executable's symbol table. I'm not sure why; perhaps it's because of GHC's "tables next to code" optimization.

Resolving a symbol

Once we have the symbol table, looking up a value is relatively easy:

name :: Symbols -> a -> String
name syms x = fromMaybe unk $ Map.lookup ptr syms where
  ptr = x `pseq` (fromIntegral . Ptr.ptrToWordPtr . Vac.getInfoPtr $ x)
  unk = printf "<unknown info table at 0x%016x>" ptr

We use vacuum to get the value's info table pointer, convert this to a Word, then look it up in the symbol table.

We explicitly evaluate x with pseq, to avoid seeing a thunk.

Testing it

We'll test with

$ ghc --make name.hs -package ghc
$ ./name

Each test below is commented with the expected output. First, let's try a few non-function values:

main :: IO ()
main = do
  syms <- getSymbols
  let test = putStrLn . name syms
  test 3            -- integer-gmp_GHC.Integer.Type_S#_con_info
  test (3 :: Int)   -- ghc-prim_GHC.Types_I#_static_info
  test "xyz"        -- ghc-prim_GHC.Types_:_con_info

GHC defaults to 3 :: Integer, as -Wall will tell you. As we see, Integer and Int are both implemented as algebraic data:

data Integer
  = S# Int#
  | J# Int# ByteArray#

data Int = I# Int#

The string "xyz" is a list built out of (:) cells.

Next let's try a few functions:

  test map          -- base_GHC.Base_map_info
  test getChar      -- base_System.IO_getChar_info
  test (+)          -- integer-gmp_GHC.Integer_plusInteger_info

Not bad! (+) defaults to operating on Integer, and GHC inlines the type class dictionary, giving us the underlying plusInteger function.

Now let's see the limits of this technique:

  test (\_ -> 'x')  -- s1jD_info
  test (const 'x')  -- stg_PAP_info
  test test         -- stg_PAP_info

Our lambda expression gets a useless compiler-generated name. The application of const is worse; it uses an info table common to all partial applications. However, we could use vacuum to follow the fields of the PAP closure, which I'll leave as an exercise to the reader. ;)

test itself is also a partial application. It's defined by applying two arguments to the function (.) defined as

(.) f g x = f (g x)

If we eta-expand test:

  let test x = putStrLn $ name syms x

then we'll get another compiler-generated name like s1mh_info.

Notes

This is a hack, and probably not suitable for any serious purpose. Shelling out to nm to get a symbol table is particularly ugly. I tried to use bindings to BFD, but ran into some segfaults.

The above code will work only on 64-bit machines, but could be adapted for 32-bit. I bet the magic offset would change. It works on GHC 6.12.1, and should work on other recent versions, if you can get vacuum to build.

It definitely requires a Unix system, and specifically Linux or something emulating Linux's proc filesystem. You'll need nm from GNU Binutils, which is standard on a system configured for C development.

It won't work if you run strip on your binaries...

32 comments:

Robert MassaioliNovember 4, 2010 at 8:16 PM
Nice hack but maybe the Haskell Prime team will see this and consider some changes to the next spec. Maybe.
ReplyDelete
Replies
TuringTestNovember 5, 2010 at 3:01 AM
I think you have a bug. "We explicitly evaluate x with seq, to avoid seeing a thunk." is incorrect.

The "x `seq` foo" does not promise that x is in WHNF before starting to evaluate foo. You need `pseq` instead.

See section 7.18.4 of the GHC user manual: "Note that we use pseq rather than seq. The two are almost equivalent, but differ in their runtime behaviour in a subtle way: seq can evaluate its arguments in either order, but pseq is required to evaluate its first argument before its second, which makes it more suitable for controlling the evaluation order in conjunction with par."
ReplyDelete
Replies
keeganNovember 5, 2010 at 12:08 PM
@TuringTest:

I was wondering about that. Since you've confirmed I'm not crazy, I'll change it.
ReplyDelete
Replies
edok69May 5, 2020 at 10:11 PM
This article is very good. We would like to leave a good article. To read more articles, click here >>>
www.bloglovin.com
5e68b4dadffdc.site123.me
babyedok.wixsite.com/mysite
manop1012.blogspot.com
medium.com/@babyedok
ReplyDelete
Replies
jamilOctober 24, 2020 at 3:02 AM
Such an informative article that you shared with us.
https://premiumcrack.com/traktor-pro-crack/
ReplyDelete
Replies
chaneeOctober 25, 2020 at 3:27 AM
I will be looking forward to your next post. Thank you
"เจ็บแต่จบ" ติ๊ก บิ๊กบราเธอร์ ตัดใจ
ReplyDelete
Replies
edok69October 25, 2020 at 6:10 AM
This is my blog. Click here.
เว็บแทงหวยออนไลน์ สมัครเลยตอนนี้ "
ReplyDelete
Replies
saniJune 5, 2021 at 2:42 PM
FL Studio Crack\
FL Studio Full Version receives further updates and fixes them automatically. The user should have the features they selected, fix bugs, continue to develop and update while we develop the FL Studio Reg Keys. It is clear to note that as soon as we start using them, we will feel their imagination and reaction. Additionally, we find that this software shows a person that they can do more with their fingers.
ReplyDelete
Replies
AnonymousFebruary 13, 2022 at 8:02 PM
อยากรู้ไหมทำไมถึงแนะนำให้ใช้บริการเดิมพันออนไลน์กับ 789bet ซึ่งเรามีหตุผลและคำแนะนำที่น่าสนใจ ที่เราอยากจะบอกคุณว่าทำไมถึงควรทำเงินกับการบริการของเราแนะนำว่า ที่นี่จัดบริการให้คุณครบแน่นอนเพราะบริการที่ทันสมัย พัฒนาการบริการตลอดเวลา และมีพันธมิตรเป็นคาสิโนดังหลายแห่งร่วมบริการ สมาชิกสามารถเล่นได้ทั้งวันพร้อมทั้งยังมีเทคโนโลยีสมัยใหม่อยู่ตลอดเวลา คลิกเข้าเล่นที่นี่ https://www.789betnow.com
ReplyDelete
Replies
ZonahobisayaMarch 22, 2022 at 1:58 AM
Gutt Websäit : Zonahobisaya
Gutt Websäit : Zonahobisaya
Gutt Websäit : Zonahobisaya
Gutt Websäit : Zonahobisaya
Gutt Websäit : Zonahobisaya
Gutt Websäit : Zonahobisaya
Gutt Websäit : Resep Masakan
Gutt Websäit : Zonahobisaya
ReplyDelete
Replies
Deepak sharmaJuly 6, 2022 at 11:30 PM
Great article,thanks for sharing with us!
ReplyDelete
Replies
AnonymousNovember 25, 2022 at 10:17 PM
영월출장샵
서귀포출장샵
계룡출장샵
당진출장샵
서귀포출장샵
경기도출장샵
경기도출장샵
충북출장샵

ReplyDelete
Replies
AnonymousDecember 3, 2022 at 10:09 PM
당진출장샵
총판출장샵
가평출장샵
수원출장샵
강원도출장샵
청주출장샵
충북출장샵
김포출장샵

ReplyDelete
Replies
pg slot asiaMarch 4, 2023 at 3:44 AM
pg slot asia สล็อต เป็นเว็บไซต์พนันสล็อตออนไลน์ ลำดับหนึ่ง ที่คนอีกหลายๆคนนั้นต้องเคยได้ฟังชื่อนี่กันมาบ้างแล้วเพราะเว็บไซต์ของพวกเรานั้น PG SLOT เปิดให้บริการโกรธยะหนึ่ง
ReplyDelete
Replies
king kong slotMarch 4, 2023 at 3:50 AM
king kong slot สามารถเล่นบนสมาร์โฟน ที่ได้ก็ได้ พวกเราเอาเกมยอดนิยม เกมสล็อตที่ โบนัสแตกบ่อยมากที่สุด pg เป็นผู้ให้บริการรองรับความต้องการของผู้เล่น ได้เป็นอย่างดี 24 ชม.
ReplyDelete
Replies
RoyceJuly 20, 2023 at 8:15 PM
This article give lot of inspiration
ReplyDelete
Replies
AmirahJuly 20, 2023 at 8:15 PM
Post more good contents pls
ReplyDelete
Replies
MoisesJuly 20, 2023 at 8:15 PM
This was an excellent page.
ReplyDelete
Replies
pgsoft pgslotAugust 14, 2023 at 8:40 PM
pgsoft pgslot เกมยอดฮิตรายปี 2023 ที่ได้รับเสียงตอบรับจากแฟนเกมสล็อตอย่างล้นหลาม pgslot ด้วยต้นแบบการเล่นที่ไม่ซ้ำใครและก็เข้าใจง่าย เล่นได้ทั้งยังในแนวดิ่งแล้วก็แนวระดับ
ReplyDelete
Replies
pg soft เครดิตฟรี50August 14, 2023 at 8:42 PM
pg soft เครดิตฟรี50 ในคู่มือฉบับสมบูรณ์นี้ เราจะสำรวจทุกสิ่งที่คุณจำเป็นต้องรู้เกี่ยวกับการรับเครดิตฟรี รวมถึงวิธีการทำงาน PG วิธีหาเครดิตฟรี และคำถามที่พบบ่อยคุณเป็นแฟนของเกมการพนันออนไลน์และคาสิโนหรือไม่
ReplyDelete
Replies
카지노사이트September 27, 2024 at 9:45 PM
Helpful info here. Please stay us updated like this. Thanks for sharing this
ReplyDelete
Replies
메이저사이트September 27, 2024 at 9:45 PM
Great site. Plenty of helpful information here. I am sending it to several buddies
ReplyDelete
Replies
토토사이트September 27, 2024 at 9:45 PM
Wow that was unusual. Just wanted to say wonderful blog! Keep it up
ReplyDelete
Replies
파워볼실시간September 27, 2024 at 9:45 PM
Thank goodness I found it. You’ve made my day! Thanks again for this information
ReplyDelete
Replies
AnonymousMarch 29, 2025 at 2:30 PM
Your writing is always so clear and concise. I always leave your blog feeling smarter. Thanks again!
ReplyDelete
Replies
신용카드 현금화April 14, 2025 at 7:24 PM
Thanks for sharing such a wonderful post.
ReplyDelete
Replies
토토사이트April 14, 2025 at 7:27 PM

Excellent and nice post. It will beneficial for everyone.
ReplyDelete
Replies
토토사이트April 14, 2025 at 7:31 PM
I must thank you for the efforts you’ve put in writing this blog.
ReplyDelete
Replies
토토사이트April 14, 2025 at 7:34 PM
I’ll oftimes be once more to learn to read much more, many thanks that information
ReplyDelete
Replies
대물카지노 가입코드April 14, 2025 at 7:39 PM

Now I fianlly found which I actually want.
ReplyDelete
Replies
엑스엑스벳 가입코드April 14, 2025 at 7:43 PM
Thanks for your excellent blog and giving great kind of information.
ReplyDelete
Replies
MeganrogerNovember 26, 2025 at 2:52 AM

you are really a good webmaster.
ReplyDelete
Replies

Add comment

main is usually a function