[2002-01-12]

TalkTools v2

 

News

2002-01-12 - Well, that was fun. This project died in a tragic disk crash. Source lost, everything. The binaries below are all that's left.
2001-11-27 - pre-release.

Introduction

TalkTools v2 is a utility software suite in the IEFFHP family of tools. Version two contains a much updated version of the original TalkTool which was written and released many years ago when Baldur's Gate had just come out. While the original version was of dubious use for people not involved in the reverse-engineering process, version two contains tools useful for the [advanced] player.

TalkTools is a third-party product, not in any way supported, affiliated, sponsored or even recognized by Bioware and/or Black Isle. Never ever turn to them for support while using files modified by TalkTools.

Purpose & Motivation

TalkTools is the result of two things. a) An itch I have about a future where none of the character names in CRPGs are pre-determined. While the engine does not support this in the grandiose way I have it pictured, we can use TalkTools to simulate that behaviour. b) My need for something practical to work on while brushing up on my C++.

This is how it works: using the tools in this suite you can extract the names of all NPCs from a dialog.tlk resource. You can then use this list as input to a Markov model, whose output you put back into the game, in effect replacing every name in the game with a new one.

To get from here to there you must go through multiple steps, described below.

Overview

As I said, TalkTools is a suite of programs. Here's a quick overview, after which I'll explain how to use them to replace the names in an Infinity Engine based game:

talktool (not yet available)

Talktool can be used to manage the TALK resources of a game based off Bioware's Infinity Engine. You can use it to extract a dialog.tlk file to an XML file for editing, and then convert it back to a dialog.tlk again. You can also use it to append and remove ranges of entries in a dialog.tlk, which can be useful in developing and installing third party add-ons.

talkextract

Talkextract is a simple tool that given a dialog.tlk file will process it and output every unique word found to standard output. Use redirection to capture the output into a file.
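As a rough illustration of what "every unique word" means, here is a minimal Python sketch. It is not the original talkextract (that source is lost) and it works on already decoded strings rather than on a dialog.tlk file. The function name unique_words and the definition of a word (a run of letters, with inner apostrophes or hyphens allowed) are assumptions made for this example.

```python
import re

# Assumption: a "word" is a run of ASCII letters, optionally joined by
# single apostrophes or hyphens (e.g. "don't", "half-elf").
WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")

def unique_words(strings):
    """Return every distinct word in `strings`, in order of first appearance."""
    seen = set()
    ordered = []
    for text in strings:
        for word in WORD.findall(text):
            if word not in seen:
                seen.add(word)
                ordered.append(word)
    return ordered

if __name__ == "__main__":
    sample = ["Minsc and Boo stand ready!", "Boo says hello to Imoen."]
    # One word per line, so the output can be redirected into a file.
    for word in unique_words(sample):
        print(word)
```

The comparison is case-sensitive, so "Boo" and "boo" would count as two different words.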

subfiles

A line-based tool which reads two lists of words and sends to standard output every word in the second list that does not exist in the first. In essence, you are subtracting one wordlist from another.
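The subtraction can be sketched in a few lines of Python. This is an illustration only, not the original subfiles. The name subtract_wordlist and the ignore_case switch are assumptions; whether the real tool compares case-sensitively is not stated here.

```python
def subtract_wordlist(first, second, ignore_case=False):
    """Return the words of `second` that do not occur in `first`.

    Both arguments are iterables of lines, one word per line.
    `ignore_case` is a hypothetical option added for this sketch.
    """
    key = str.lower if ignore_case else (lambda word: word)
    known = {key(line.strip()) for line in first if line.strip()}
    result = []
    for line in second:
        word = line.strip()
        if word and key(word) not in known:
            result.append(word)
    return result

if __name__ == "__main__":
    dictionary = ["the", "and"]
    candidates = ["The", "Minsc", "and"]
    print(subtract_wordlist(dictionary, candidates))
    print(subtract_wordlist(dictionary, candidates, ignore_case=True))
```

Putting the first list in a set keeps each lookup cheap even when the wordlist is large.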

talkreplace

This utility is a search-and-replace tool working directly on a dialog.tlk file. Given a dialog.tlk and a file containing a list of search-string = replace-with entries, it rewrites the dialog.tlk with the substitutions applied. Note that it is case-sensitive (** see program options for more info).

markovgen

Markovgen implements an Order-2 Markov Chain Generator. A Markov Model is built using a source file, and the model can then be used to create new output conforming to the statistical properties of the original input. In short, you can use this to generate names.
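To make the idea concrete, here is a minimal order-2 Markov chain sketch in Python. It is not the original markovgen. It assumes the chain works on characters (each letter is chosen from the two letters before it), which suits name generation but is a guess about the real tool. The names build_model and generate, the "^" and "$" markers and the max_len cap are all made up for this example.

```python
import random
from collections import defaultdict

START = "^"  # padding marker before a name (assumed not to occur in names)
END = "$"    # marker for the end of a name

def build_model(names):
    """Map each pair of consecutive characters to the characters seen after it."""
    model = defaultdict(list)
    for name in names:
        padded = START * 2 + name + END
        for i in range(len(padded) - 2):
            model[padded[i:i + 2]].append(padded[i + 2])
    return model

def generate(model, rng, max_len=12):
    """Walk the chain from the start state until END is drawn or max_len is hit."""
    state = START * 2
    out = []
    while len(out) < max_len:
        nxt = rng.choice(model[state])
        if nxt == END:
            break
        out.append(nxt)
        state = state[1] + nxt
    return "".join(out)

if __name__ == "__main__":
    names = ["Imoen", "Jaheira", "Khalid", "Minsc", "Montaron", "Sarevok"]
    model = build_model(names)
    rng = random.Random()
    for _ in range(5):
        print(generate(model, rng))
```

Followers are stored with repeats, so a letter that often follows a given pair in the input is drawn just as often in the output. With a small input list the generator will mostly reproduce the input names; it needs a longer list before it starts inventing new ones.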

listclean (not yet available)

Used to manipulate a list of words in different ways. You can change the case of words, filter words based on case and/or length, and more. Does nothing sed wouldn't do, but you might not have that.

But how do I ... ?

Say you, for some reason or another, want to try this name-changing concept out. What must you do? Well, you really only need one thing: a text file mapping old names to new names, which can be run through talkreplace.

So you need old names and new names. Okay. There are two possibilities: you could create this file by hand, putting in mappings for your favourite (or maybe least favourite) NPCs, or you could go automatic and try to change every name in the game.

For alternative one, just write the file and run talkreplace and you are done. The file could look like this:

Sarevok = Gates
Koveras = Setag
Imoen = Annae
Jaheira = Nynaeve
Khalid = G-G-Gwendolyn
Minsc = Steve
Boo = Marvin
Xzar = Licklidder
Montaron = Larry
Tiax = Dregen
Gorion = Ritalin

If you saved this as bg2replace.dat you could execute the replacements by issuing the command talkreplace dialog.tlk bg2replace.dat. Done. Enjoy.
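For the curious, the mapping file format above is easy to read and apply. The following Python sketch is not the original talkreplace: it works on a plain string rather than on a dialog.tlk, and it assumes that only whole words are replaced, which may differ from the real tool. The names parse_mapping and apply_mapping are made up for this example.

```python
import re

def parse_mapping(lines):
    """Parse 'search = replace' lines into a dict, skipping lines without '='."""
    mapping = {}
    for line in lines:
        if "=" not in line:
            continue
        old, new = line.split("=", 1)
        old, new = old.strip(), new.strip()
        if old:
            mapping[old] = new
    return mapping

def apply_mapping(text, mapping):
    """Replace every whole-word, case-sensitive occurrence of a key in one pass."""
    if not mapping:
        return text
    # Longest keys first, so a longer name wins over a shorter one it starts with.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda match: mapping[match.group(0)], text)

if __name__ == "__main__":
    mapping = parse_mapping(["Minsc = Steve", "Boo = Marvin"])
    print(apply_mapping("Minsc and Boo read a Book.", mapping))
```

Doing all substitutions in a single pass means a new name is never itself replaced again, so a pair like Sarevok = Koveras and Koveras = Sarevok would swap cleanly.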

For the second route, replacing every name, well... Unless you just happen to have played out every dialogue and remember every name you saw, we have a small problem -- how to get just the names out of the dialog.tlk file. I toyed with several possible solutions, including hidden Markov models and simple syntactic heuristics, before I arrived at the simple solution I chose.

First of all, all proper names begin with a capital letter. So if we throw away every word which doesn't start with a capital letter, we should have culled the set pretty hard.

This is implemented by first using talkextract to get out every unique word, and then using listclean** to remove every word not beginning with a capital letter.
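Since listclean is not available, the capital-letter filter is simple enough to sketch in Python. The name keep_capitalised is made up for this example, and the sketch says nothing about how listclean itself is invoked.

```python
def keep_capitalised(words):
    """Keep only the words whose first character is an uppercase letter."""
    return [word for word in words if word[:1].isupper()]

if __name__ == "__main__":
    print(keep_capitalised(["Minsc", "and", "Boo", "the", "Imoen"]))
```

Slicing with word[:1] instead of indexing with word[0] means an empty line in the list is dropped rather than causing an error.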

The new list contains many non-names which stood at the beginning of sentences. The next step is to remove these from the set, reducing it further towards the "pure-name" state. We do this by subtracting from the list every word which occurs in a wordlist of our choice, preferably one not containing any names.

The list is now reduced to names, sound-words and words not found in the wordlist (mostly because they were misspelled, or so domain-specific that the wordlist did not cover them). The last automatic thing we can do is clean up the list by removing sound-words. Applying listclean** again, we instruct it to remove words shorter than two characters, words containing more capital than lowercase characters, and words containing the same character three times in a row.
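The three sound-word rules can be written down directly. This Python sketch is only an illustration of the rules as described, not of listclean; the names looks_like_sound_word and drop_sound_words are made up, and treating "Aaa" as three of the same character (by ignoring case) is an assumption.

```python
import re

# The same character three times in a row, checked on the lowercased word.
TRIPLE = re.compile(r"(.)\1\1")

def looks_like_sound_word(word):
    """Apply the three rules: too short, mostly capitals, or a tripled character."""
    upper = sum(1 for ch in word if ch.isupper())
    lower = sum(1 for ch in word if ch.islower())
    if len(word) < 2:
        return True
    if upper > lower:
        return True
    return TRIPLE.search(word.lower()) is not None

def drop_sound_words(words):
    """Return the words that survive all three rules."""
    return [word for word in words if not looks_like_sound_word(word)]

if __name__ == "__main__":
    print(drop_sound_words(["Minsc", "A", "ARGH", "Aaargh", "Boo", "Hmmm"]))
```

The rules are deliberately crude. They will let some junk through, which is why the list still needs a pass by hand afterwards.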

Having come this far you will have a short list (around 32Kb for Baldur's Gate 2) of possible names. All you have to do now is go through the list by hand, removing any entries which are not names.

Of course, this will only have to be done once per game. The quality of the list is somewhat important, so here's your chance to contribute to the community, by offering high quality name lists for download.

Having a list of names to replace, we move on to the new names. These you could either create yourself, or you can throw the dice by using the included markovgen program to generate them.

yet to be written ...

Download

You can get the latest version here. It is currently a pre-release, so there are no version numbers or source. The full release will contain the full source of all the tools, released under the GNU GPL.

Support

The tools in the TalkTools suite were built to be fairly generic. The ones operating on dialog.tlk will function on any valid TLK V1 resource file. Today that includes every version known to me of:

In addition, third-party add-ons should not pose a problem. However, some of the tools might not properly support languages other than English natively (changing/comparing case).
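For readers wondering what a TLK V1 resource looks like, here is a small Python reader. It is a sketch under stated assumptions, not code from TalkTools: the field layout (an 18-byte header followed by 26-byte entries, with string offsets relative to the start of the string data) is taken from community reverse-engineering notes rather than from this page, and the cp1252 text encoding is a guess that fits English game files. Check it against a real file before relying on it. The name read_tlk_strings is made up.

```python
import struct

# Assumed little-endian layout of a TLK V1 file:
# header: signature, version, language id, entry count, offset of string data
HEADER = struct.Struct("<4s4sHII")
# entry: flags, sound resource name, volume, pitch, string offset, string length
ENTRY = struct.Struct("<H8sIIII")

def read_tlk_strings(data, encoding="cp1252"):
    """Return the text of every entry in a TLK V1 byte string."""
    signature, version, _lang, count, strings_offset = HEADER.unpack_from(data, 0)
    if signature != b"TLK " or version != b"V1  ":
        raise ValueError("not a TLK V1 resource")
    strings = []
    for index in range(count):
        position = HEADER.size + index * ENTRY.size
        _flags, _sound, _vol, _pitch, offset, length = ENTRY.unpack_from(data, position)
        start = strings_offset + offset  # offsets are relative to the string data
        strings.append(data[start:start + length].decode(encoding, errors="replace"))
    return strings

if __name__ == "__main__":
    # Work on a copy, never on the only dialog.tlk you have.
    with open("dialog.tlk", "rb") as handle:
        for text in read_tlk_strings(handle.read())[:10]:
            print(text)
```

The encoding parameter is where support for other languages would come in, since a translated dialog.tlk may not use the same code page.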

I will not, in general, offer support on how to use, modify or compile these tools. If you don't understand why or how, don't ask me. This software is supplied as is, without any warranties. Don't run these tools unless you have made backup copies of any source and output files you use. The only native Infinity Engine file touched by these tools is dialog.tlk, so make sure you have a copy of it put away in a safe place. It is possible to break the game using these tools. If you do, do not blame Bioware or Black Isle.

If you experience crashes or other unwanted behaviour, clear the cache directory and run the game with the original dialog.tlk to see if that solves your problem. If you manage to break the game using well-formed input to TalkTools, feel free to contact me so that I can locate the bug, if any.

©2001 Eddy L O Jansson. All rights reserved. All trademarks acknowledged.