18 KiB
Text support
Acedia provides its own Text
/ MutableText
classes for working with text
that are supposed to replace string
variables as much as possible.
Main reasons to forgo string
in favor of custom text types are:
string
does not allow cheap access to either individual characters or codepoints, which makes computingstring
's hash too expensive;- Expanding
string
's functionality without introducing new types would require (for many cases) to disassemble it into codepoints and then to assemble it back for each transformation; - Established way of defining characters' color for
string
s is inconvenient to work with.
These issues can be resolved with our new text types: Text
and MutableText
,
whose only difference is their mutability.
NOTE:
Text
andMutableText
aren't yet in their finished state: using them is rather clunky compared to nativestring
s and both their interface and implementation can be improved. While they already provide some important benefits, Acedia's insistence on replacingstring
withText
is more motivated by its supposed future, rather than current, state.
string
Even if Text
/MutableText
are supposed to replace string
variables, they
still have to be used to either produce Text
/MutableText
instances or
to store their values in config files.
This means we have to cover how Acedia deals with string
s.
Colored vs plain strings
Colored strings are normal UnrealScript string
s that can contain
4-byte color changing sequences. Whenever some Acedia function takes
a colored string these color changing sequences are converted into formatting
information about color of its characters and are not treated
as separate symbols.
If you are unaware, 4-byte color changing sequences are defined as
<0x1b><red_byte><green_byte><blue_byte>
and they allow to color text that is being displayed by several native UnrealScript functions. For example,string
that is defined as"One word is colored" @ Chr(0x1b) $ Chr(1) $ Chr(255) $ Chr(1) $ "green"
will be output in game's console with its last word colored green. Red and blue bytes are taken as1
instead of0
because putting zero inside break thestring
.10
is another value that leads to unexpected results and should be avoided.
Plain strings are string
s for which all contents are treated as their own
symbols.
If you pass a string
with 4-byte color changing sequence to some method as
a plain string, these 4 bytes will also be treated as characters and
no color information will be extracted as a result.
Plain strings are generally handled faster than colored strings.
Formatted strings
Formatted string
s are Acedia's addition and allow to define color information
in a more human-readable way than colored strings.
To mark some part of a string
to have a particular color you need to enclose
it into curly braces {}
, specify color right after the opening brace
(without any spacing), then, after a single whitespace, must follow
the colored content.
For example, "Each of these will be colored appropriately: {#ff0000 red}, {#00ff00 green}, {#0000ff blue}!"
will correspond to a line
Each of these will be colored appropriately: red, green, blue!
and only three words representing colors will have any color defined for them.
Color can be specified not only in hex format, but in also in one of
the more readable ways: rgb(255,0,0)
, rgb(r=0,G=255,b=255)
,
rgba(r=45,g=167,b=32,a=200)
.
Or even using color aliases:
"Each of these will be colored appropriately: {$red red}, {$green green}, {$blue blue}!"
.
These formatting blocks can also be folded into each other:
"Here {$purple is mostly purple, but {$red some parts} are {$yellow different} color}."
with an arbitrary depth.
Conversion
Various types of string
s can be converted between each other by using
Text
class, but do note that formatted strings can contain more information
than colored strings (since latter cannot simply close the colored segment)
and both of them can contain more information than plain strings, so
such conversion can lead to information loss.
Examples of conversion:
local Text auxiliary;
auxiliary = _.text.FromFormattedString("{$gold Hello}, {$crimson world}!");
// Produces a string colored with 4-byte codes, a native way for UnrealScript
auxiliary.ToColoredString();
// Strings all color and produces "Hello, world!"
auxiliary.ToString();
// Don't forget the cleanup!
_.memory.Free(auxiliary);
Character
Character
describes a single symbol of a string and is a smallest text element
that can be returned from a string
by Acedia's methods.
It contains data about what symbol it represents and what color it has.
Character
can also be considered invalid, which means that it does not
represent any valid symbol. Validity can be checked with
_.text.IsValidCharacter()
method.
Character
is defined as a structure with public fields
(necessary for the implementation), but you should not access them directly
if you wish for your code to stay compatible with future versions of Acedia.
Formatting
Formatting describes how character should be displayed, which currently
corresponds to simply it's color (or the lack of it).
Formatting of a character can be accessed through
_.text.GetCharacterFormatting()
method and changed
with _.text.SetFormatting()
.
It is a structure that contains two public fields, which can be freely accessed
(unlike Character
's fields):
isColored
: defines whetherCharacter
is even colored.color
: color of theCharacter
. Only used ifisColored == true
.
Text
and MutableText
Text
is an AcediaObject
that must be appropriately allocated
(also deallocated) and is used by Acedia as substitute for a string
.
It's contents are immutable: you can expect that they will not change if you
pass a Text
as an argument to some method, although the whole object can
be deallocated.
MutableText
is a child class of a Text
that can change its own contents.
To create either of them you can use TextAPI
methods:
_.text.Empty()
to create empty mutable text,
_.text.FromString()
/ _.text.FromStringM()
to create immutable/mutable
text variants from a plain string
and their analogues
_.text.FromColoredString()
/ _.text.FromColoredStringM()
/
_.text.FromFormattedString()
/ _.text.FromFormattedStringM()
for colored and formatted string
s.
You can also get a string
back by calling either of
self.ToString()
/ self.ToColoredString()
/ self.ToFormattedString()
methods.
To duplicate Text
/ MutableText
themselves you can use Copy()
for immutable copies and MutableCopy()
for mutable ones.
Defining Text
/ MutableText
constants
The major drawback of Text
is how inconvenient it is to use it, compared to
simple string literals. It needs to be defined, allocated, used and
then deallocated:
local Text message;
message = _.text.FromString("Just some message to y'all!");
_.console.ForAll().WriteLine(message)
.FreeSelf(); // Freeing console writer
message.FreeSelf(); // Freeing message
which can lead to some boilerplate code. Unfortunately, at this moment not much
can be done about this boilerplate. An ideal way to work with text literals
right now is to create Text
instances with all the necessary text constants on
initialization and then use them:
class SomeClass extends AcediaObject;
var Text MESSAGE, SPECIAL;
protected function StaticConstructor()
{
default.MESSAGE = _.text.FromString("Just some message to y'all!");
default.SPECIAL = _.text.FromString("Only for special occasions!");
}
public final function DoSend()
{
_.console.ForAll().WriteLine(MESSAGE).FreeSelf();
}
public final function DoSendSpecial()
{
_.console.ForAll().WriteLine(SPECIAL).FreeSelf();
}
Acedia also pre-defines stringConstants
array that will be automatically
converted into an array of Text
s that can later be accessed by their indices
through the T()
method:
class SomeClass extends AcediaObject;
var int TMESSAGE, TSPECIAL;
public final function DoSend()
{
_.console.ForAll().WriteLine(T(TMESSAGE)).FreeSelf();
}
public final function DoSendSpecial()
{
_.console.ForAll().WriteLine(T(TSPECIAL)).FreeSelf();
}
defaultproperties
{
TMESSAGE = 0
stringConstants(0) = "Just some message to y'all!"
TSPECIAL = 1
stringConstants(1) = "Only for special occasions!"
}
This way of doing things is a bit more cumbersome, but is also safer in
the sense that T()
will automatically allocate a new Text
instance should
someone deallocate previous one:
local Text oldOne, newOne;
oldOne = T(TMESSAGE);
// `T()` returns the same instance of `Text`
TEST_ExpectTrue(oldOne == T(TMESSAGE))
// Until we deallocate it...
oldOne.FreeSelf();
// ...then it creates and returns newly allocated `Text` instance
newOne = T(TMESSAGE);
TEST_ExpectTrue(newOne.IsAllocated());
// This assertion *might* not actually be correct, since `newOne` can be
// just an `oldOne`, reallocated from the object pool.
// TEST_ExpectFalse(oldOne == newOne);
An easier way
While you should ideally define Text
constants, setting them up can
get annoying.
To alleviate this issue Acedia provides three more methods for quickly
converting string
s into Text
: P()
for plain string
s,
C()
for colored string
s and F()
for formatted string
s.
With them out SomeClass
can be rewritten as:
class SomeClass extends AcediaObject;
public final function DoSend()
{
_.console.ForAll().WriteLine(P("Just some message to y'all!")).FreeSelf();
}
public final function DoSendSpecial()
{
_.console.ForAll().WriteLine(P("Only for special occasions!")).FreeSelf();
}
They do not endlessly create Text
instances, since they cache and reuse
the ones they return for the same string
:
local Text firstInstance;
firstInstance = F("{$purple Some} {$red colored} {$yellow text}.");
// `F()` returns the same instance for the same `string`
TEST_ExpectTrue( firstInstance
== F("{$purple Some} {$red colored} {$yellow text}."));
// But not for different one
TEST_ExpectFalse(firstInstance == F("Some other string"));
// Still the same
TEST_ExpectTrue( firstInstance
== F("{$purple Some} {$red colored} {$yellow text}."));
Ideally one would at some point replace these calls with pre-defined constants,
but if you're using only a small amount of literals in your class,
then relying on them should be fine. However avoid using them for
an arbitrarily large amounts of string
s, since as cache's size grows,
these methods will become increasingly less efficient:
// The more you call this method with different arguments, the worse
// performance gets since `C()` has to look `string`s up in
// larger and larger cache.
public function DisplayIt(string message)
{
// This is bad, don't do this
_.console.ForAll().WriteLine(C(message)).FreeSelf();
}
Parsing
Acedia provides some parsing functionality through a Parser
class:
it must first be initialized by either Initialize()
or InitializeS()
method
(the only difference whether they take Text
or string
as a parameter)
and then it can parse passed contents by consuming its symbols from
the beginning to the end.
For that it provides a set of matcher methods that try to read certain values from the input. For example, following can parse a color, defined in a hex format:
local Parser parser;
local int redComponent, greenComponent, blueComponent;
parser = _.text.ParseString("#23a405");
parser.MatchS("#").MUnsignedInteger(redComponent, 16, 2)
.MUnsignedInteger(greenComponent, 16, 2)
.MUnsignedInteger(blueComponent, 16, 2);
// These should be correct values
TEST_ExpectTrue(redComponent == 35);
TEST_ExpectTrue(greenComponent == 164);
TEST_ExpectTrue(blueComponent == 5);
Here MatchS()
matches an exact string
constant and MUnsignedInteger()
matches an unsigned number (with base 16
) of length 2
, recording parsed
value into its first argument.
Another example of parsing a color in format rgb(123, 135, 2)
:
local Parser parser;
local int redComponent, greenComponent, blueComponent;
parser = _.text.ParseString("RGB( 123,135 , 2)");
parser.MatchS("rgb(", SCASE_INSENSITIVE).Skip()
.MInteger(redComponent).Skip().MatchS(",").Skip()
.MInteger(greenComponent).Skip().MatchS(",").Skip()
.MInteger(blueComponent).Skip().MatchS(")");
// These should be correct values
TEST_ExpectTrue(redComponent == 123);
TEST_ExpectTrue(greenComponent == 135);
TEST_ExpectTrue(blueComponent == 2);
TEST_ExpectTrue(parser.Ok());
where MInteger()
matches any decimal integer and then records that integer
into the first argument. Skip()
matches a sequence of whitespaces of
an arbitrary length, adding some these calls allows this code to parse colors
defined with spacings between numbers and other characters like
rgb( 12, 13 , 107 )
. Ok()
method simply confirms that all matching calls
so far have succeeded.
If you are unsure in which format the color was defined, then you can use
Parser
's methods for remembering/restoring a successful state:
you can first call parser.Confirm()
to record that all the parsing so far
was successful and should not be discarded, then try to parse hex color.
After that:
- If parsing was successful, -
parser.Ok()
check will returntrue
and you can callparser.Confirm()
again to mark this new state as one that shouldn't be discarded. - Otherwise you can call
parser.R()
to reset yourparser
to the state it was at the lastparser.Confirm()
call (or the initial state if noparser.Confirm()
calls were made) and try parsing the color in some other way.
local Parser parser;
local int redComponent, greenComponent, blueComponent;
...
// Suppose we've successfully parsed something and
// need to parse color in one of the two forms next,
// so we remember the current state
parser.Confirm(); // This won't do anything if `parser` has already failed
// Try parsing color in it's rgb-form;
// It's not a major issue to have this many calls before checking for success,
// since once one of them has failed - others won't even try to do anything.
parser.MatchS("rgb(", SCASE_INSENSITIVE).Skip()
.MInteger(redComponent).Skip().MatchS(",").Skip()
.MInteger(greenComponent).Skip().MatchS(",").Skip()
.MInteger(blueComponent).Skip().MatchS(")");
// If we've failed - try hex representation
if (!parser.Ok())
{
parser.R().MatchS("#")
.MUnsignedInteger(redComponent, 16, 2)
.MUnsignedInteger(greenComponent, 16, 2)
.MUnsignedInteger(blueComponent, 16, 2);
}
// It's fine to call `Confirm()` without checking for success,
// since it won't do anything for a parser in a failed state
parser.Confirm();
You can store even more different parser states with
GetCurrentState()
/RestoreState()
methods. In fact, these are the ones used inside a lot of Acedia's methods to avoid changing mainParser
's state that user can rely on.
For more details and examples see the source code of Parser.uc
or
any Acedia source code that uses Parser
s.
JSON support
NOTE: This section is closely linked with Collections.
Acedia's text capabilities also provide limited JSON support. That is, Acedia can display some of it's types as JSON and parse any valid JSON into its types/collections, but it does not guarantee verification of whether parsed JSON is valid and can also accept some technically invalid JSON.
Main methods for these tasks are _.json.Print()
/_.json.PrettyPrint()
and
_.json.ParseWith()
, but there are some more type-specialized methods as well.
Here are the current rules of conversion from JSON to Acedia's types via
_.json.ParseWith()
:
- Null values will be returned as
none
; - Number values will be return as an
IntBox
/IntRef
if they consist of only digits (and optionally a sign) andFloatBox
/FloatRef
otherwise. Choice between box and ref is made based onparseAsMutable
parameter (boxes are immutable, refs are mutable); - String values will be parsed as
Text
/MutableText
, based onparseAsMutable
parameter; - Array values will be parsed as a
DynamicArray
, it's items parsed according to these rules (parseAsMutable
parameter is propagated). - Object values will be parsed as a
AssociativeArray
, it's items parsed according to these rules (parseAsMutable
parameter is propagated) and recorded under the keys parsed intoText
.
And printing with _.json.Print()
/_.json.PrettyPrint()
follows
symmetrical rules:
none
is printed into "null";- Boolean types (
BoolBox
/BoolRef
) are printed into JSON bool value; - Integer (
IntBox
/IntRef
) and float (FloatBox
/FloatRef
) types are printed into JSON number value; Text
andMutableText
are printed into JSON string value;DynamicArray
is printed into JSON array withPrint()
method applied to each of its items. If some of them have not printable types - "none" will be used for them as a replacement.AssociativeArray
is printed into JSON object withPrint()
method applied to each of it's items. Only items withText
keys are printed, the rest is omitted. If some of them have not printable types - "none" will be used for them as a replacement.
The difference between _.json.Print()
and _.json.PrettyPrint()
is that
_.json.Print()
prints out a minimal, compact json, while
_.json.PrettyPrint()
prints a more human-readable JSON with indentation and
color highlights.