MicroPython ‘bytes’ Data Type

Introduction

MicroPython is a subset of the parent Python language and is primarily used on microcontroller boards with constrained processor speeds and memory capacity. This tutorial describes the bytes data type for the MicroPython port that runs on the BBC micro:bit.

A MicroPython bytes object is a sequence of integers in the range 0 to 255 (0b00000000 to 0b11111111 in binary notation). This means that each element of the sequence is exactly 1 byte (8 bits) in size. A bytes object is immutable: once it is created it cannot be changed, the same as for the string and tuple data types.

Budding programmers are often confused when first confronted with the bytes (and bytearray) data types. In reality though they are a very simple concept and can be incredibly useful for microcontroller programming. The trick is to think of them for what they are: simply a collection of 8-bit bytes. Nothing more, nothing less.

What can really add to the confusion is the way MicroPython displays bytes and bytearrays; this is discussed in detail below.

Basic bytes Operations

This section will describe how bytes are displayed, declared, accessed (indexing and slicing) and looped through.

Displaying a bytes Object

As stated above, a bytes object is a container for a collection of 8-bit bytes, and its elements are simply stored in memory as byte values. However, the way MicroPython displays the object (e.g. with the print() function) can be confusing at first glance.

A bytes object is displayed in the following manner:


b'string'

Where:
b: The prefix character 'b' indicates
   a bytes object.

string: A string that conveys the contents
        of the bytes object.

Example:
b'\x01\nA'
          

Consider the string '\x01\nA' from the example above. It is this that often leads to confusion. However, there is a simple set of rules that defines the actual byte data the bytes string represents.

  1. The string represents a collection of ordered bytes with no spaces or other separators between them. The string '\x01\nA' represents three byte values; \x01, \n and A.
  2. If the byte value is an ASCII code for a printable character then that character is displayed e.g. if the value is 65 (decimal) then the character 'A' will display.
  3. MicroPython recognises a small set of special characters known as escape characters. An escape character is written as a backslash (\) followed by a printable character. For example, \n is commonly used in a print() statement to force a newline.

    The MicroPython escape characters are listed in Table 1. If a bytes element holds the ASCII code of a newline, carriage return, tab or backslash then the escape character representation will be displayed e.g. the byte 10 will display as \n. The backspace and form feed escapes can be typed in a bytes literal but, as the output of Example 3 shows for the value 8, they are displayed in hexadecimal form (\x08 and \x0c).

  4. And finally… any remaining value (i.e. neither a printable character nor a displayed escape character) is displayed as a hexadecimal value prefixed by \x. Thus a byte value of 143 (decimal) would be displayed as \x8f.

Table 1: MicroPython Escape Characters

Code    Description         ASCII
\'      Single quotation    39
\\      Backslash           92
\n      New line            10
\r      Carriage return     13
\t      Tab                 9
\b      Backspace           8
\f      Form feed           12

The following table gives examples of these rules in application. The Byte Values column is the byte value (shown in decimal) that is stored in memory for that element.

String                  Elements                    Byte Values
'\x01\nA'               \x01, \n, A                 1, 10, 65
'\x8f\n\x1d'            \x8f, \n, \x1d              143, 10, 29
'\x01\x02\x03\x04'      \x01, \x02, \x03, \x04      1, 2, 3, 4
'Hello'                 H, e, l, l, o               72, 101, 108, 108, 111
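
The rules can be checked directly at the REPL. The short sketch below uses only values taken from the table above and shows that the displayed string is just one spelling of the stored numbers: the same byte may be written as a character, an escape character or a hexadecimal value.

```python
# One bytes object, written as a bytes string.
b = b'\x01\nA'

print(len(b))     # 3 elements, not 8 characters
print(list(b))    # the underlying byte values: [1, 10, 65]

# A printable character and its hexadecimal form are the same byte.
print(b'A' == b'\x41')           # True

# The newline escape and its hexadecimal form are the same byte.
print(b'\n' == b'\x0a')          # True

# Building the object from numbers gives an identical result.
print(bytes([1, 10, 65]) == b)   # True
```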

Declaring

A bytes object is declared using the bytes() function.

Syntax


variable_name = bytes([source[,encoding]])

Where:
variable_name: Name assigned to the bytes object.

source: Optional. An integer (the number of
        zero-valued elements to create), a
        sequence such as a list or tuple of
        values in the range 0..255, or a string.

encoding: Optional, only required if a string
          is passed as the source value.

Examples:
# An empty bytes object
b1 = bytes()
⇒ b''

# bytes object with two elements
# automatically populated with 0's.
b1 = bytes(2)
⇒ b'\x00\x00'

# bytes object populated from a list
bytes([1, 10, 65])
⇒ b'\x01\nA'

# A bytes object populated from a string.
# The encoding argument is required
# else an error occurs.
bytes('ABC', 'utf-8')
⇒ b'ABC'

# A bytes object populated from a string
# without the encoding argument
# causes an error.
b1 = bytes('ABC')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: wrong number of arguments
          

When converting a list or tuple to a bytes object, every element of the list or tuple must be numerical and must have a value between 0 and 255; otherwise an error will occur.
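
As a minimal sketch of that rule, the helper below (to_bytes_or_none is a hypothetical name, not part of MicroPython) wraps the conversion in a try block. It assumes that an out-of-range value raises ValueError and a non-numerical value raises TypeError; the exact exception type and message may differ between ports, so both are caught.

```python
def to_bytes_or_none(values):
    """Hypothetical helper: return bytes(values), or None if the conversion fails."""
    try:
        return bytes(values)
    except (ValueError, TypeError) as err:
        # Assumption: these are the exceptions raised for bad elements.
        print('Cannot convert', values, '-', err)
        return None

print(to_bytes_or_none([0, 127, 255]))   # all values are valid
print(to_bytes_or_none((1, 2, 256)))     # 256 does not fit in one byte
print(to_bytes_or_none([1, 'A', 3]))     # 'A' is not a number
```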

Indexing

Each element of a bytes object can be accessed through its index. Similar to all other MicroPython objects that support indexing (string, list, tuple, etc.) the first element of a bytes object has an index of 0 (zero).

The value returned by indexing is the true underlying 8-bit byte value, not the byte string representation that is displayed by (for example) the print() function. The following example should help clarify this important point.

Example 1

# Examples of accessing elements of a
# bytes object through their indexes.

b1 = bytes([10, 50, 90, 140])
print(b1)
print(b1[0], b1[1], b1[2], b1[3], '\n')

print('Accessing elements in reverse order...')
print(b1[-1], b1[-2], b1[-3], b1[-4])

          
Output:

b'\n2Z\x8c'
10 50 90 140 

Accessing elements in reverse order...
140 90 50 10
          

This example shows that the index operation returns the actual underlying byte value i.e. 10, 50, 90 and 140. The byte string displayed by the print(b1) statement looks far more intimidating but is easily broken down and understood by following the rules discussed previously.

Since there are four byte values in the b1 collection, the byte string '\n2Z\x8c' can be broken down into the four components: '\n', '2', 'Z' and '\x8c' where:


'\n' = ascii(10) = 'newline' escape character
'2' = ascii(50)
'Z' = ascii(90)
'\x8c' = 8c (hexadecimal) = 140 (decimal)
          

Example 1 also shows how elements can be accessed in reverse order by using negative indexes.
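
A related point that is easy to miss: an index returns an integer, whereas a one-element slice returns a bytes object. The sketch below reuses the b1 values from Example 1 to show the difference, along with the built-in hex(), bin() and chr() functions for viewing a single value in other notations.

```python
b1 = bytes([10, 50, 90, 140])

element = b1[0]      # indexing gives an int
piece = b1[0:1]      # slicing gives a bytes object of length 1

print(element)       # 10
print(piece)         # b'\n'

# The same stored value shown in hexadecimal and binary notation.
print(hex(b1[3]), bin(b1[3]))   # 0x8c 0b10001100

# The character whose ASCII code is 90.
print(chr(b1[2]))    # Z
```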

The bytes data type is immutable meaning that once an object is declared it cannot be changed. Any attempt to do so will result in an exception (error) being raised as shown in the next example.

Example 2

# Demonstrates that a bytes object is
# immutable i.e. cannot be changed 
# once declared.

l = [1, 2, 3, 4, 5]
b = bytes(l)
print(l, '\n', b)

# Attempting to update the bytes 
# object 'b' will result in an error.
print('First element of b:', b[0])
print('Attempting to update b...')
b[0] = 20
print(b[0])
        
Output:

[1, 2, 3, 4, 5] 
 b'\x01\x02\x03\x04\x05'
First element of b: 1
Attempting to update b...
Traceback (most recent call last):
  File "main.py", line 13, in <module>
TypeError: 'bytes' object doesn't support item assignment
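
Because a bytes object cannot be updated in place, a changed copy has to be built instead. The sketch below shows two possible approaches using the same values as Example 2; the variable names are illustrative only.

```python
b = bytes([1, 2, 3, 4, 5])

# Option 1: join a replacement byte to a slice of the original.
b_new = bytes([20]) + b[1:]
print(b_new)             # b'\x14\x02\x03\x04\x05'

# Option 2: copy into a mutable bytearray, edit it, then convert back.
scratch = bytearray(b)
scratch[0] = 20
b_copy = bytes(scratch)
print(b_copy == b_new)   # True

# The original object is untouched.
print(b)                 # b'\x01\x02\x03\x04\x05'
```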
        

Slicing

To access a specific range of elements inside a bytes object, the slicing operator, a colon ‘:’, is used.

If only a number after the colon is supplied, the first n values of the bytes object will be returned, where n is the supplied index. For example b1[:3] will return the first three elements of the bytes object b1. Likewise, if only a number before the colon is supplied, all elements from that index to the end are returned.

If two numbers are supplied with the slicing operator then all elements starting from the first index up to but not including the second index will be returned.

Negative indexes can also be used with the slicing operator allowing access from the end of the bytes object.

Example 3

# Demonstrates the use of the slicing
# operator ':' on a bytes object.

# Define & populate a bytes object.
l = [9, 5, 9, 2, 4, 8]
b = bytes(l)
print('l:', l)
print('b:', b)

# Use the slicing operator with one index.
print('b[:3] =', b[:3])

# Use the slicing operator with two indexes.
print('b[2:5] =', b[2:5])

# Use the slicing operator
# with negative indexes.
print('b[-5:-2] =', b[-5:-2])
          
Output:

l: [9, 5, 9, 2, 4, 8]
b: b'\t\x05\t\x02\x04\x08'
b[:3] = b'\t\x05\t'
b[2:5] = b'\t\x02\x04'
b[-5:-2] = b'\x05\t\x02'
          

Note: The slicing operation on a bytes object always returns another bytes object which will be a subset of the original object.
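
Slicing is handy for pulling fields out of a block of received data. The sketch below assumes a made-up packet layout (a start marker, a length byte, the payload and one trailing byte); the layout and the variable names are hypothetical and only serve to illustrate the slicing forms described above.

```python
# Hypothetical 6-byte packet: start marker, payload length,
# payload bytes, then a single trailing byte.
packet = bytes([0xA5, 3, 10, 20, 30, 0x3C])

header = packet[:2]               # first two elements
length = packet[1]                # indexing returns an int
payload = packet[2:2 + length]    # elements 2, 3 and 4
trailer = packet[-1:]             # last element, as a bytes object
rest = packet[2:]                 # everything from index 2 onwards

print(list(header), length, list(payload), list(trailer))
# [165, 3] 3 [10, 20, 30] [60]
```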

Looping through a bytes Object

It is a simple process to access each element of a bytes object using a for loop.

Example 4

# Demonstrates how to access each element
# of a bytes object using a 'for' loop.


# Define a bytes object.
l = [9, 5, 9, 2, 4, 8]
b = bytes(l)
print('l:', l)
print('b:', b)

# looping through the bytes object.
for byte in b:
    print(byte, end=' ')
    
print()
          
Output:

l: [9, 5, 9, 2, 4, 8]
b: b'\t\x05\t\x02\x04\x08'
9 5 9 2 4 8
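
Since each element produced by the loop is an integer, it can be used directly in arithmetic. As a small sketch, the loop below uses enumerate() to get the index alongside each value and adds the values into a simple 8-bit checksum; the checksum scheme is an illustrative assumption, not something defined by MicroPython.

```python
b = bytes([9, 5, 9, 2, 4, 8])

total = 0
for index, value in enumerate(b):
    # value is an int, so it can be printed in any notation.
    print(index, value, hex(value))
    total += value

# Keep only the low 8 bits so the result fits in one byte.
checksum = total & 0xFF
print('checksum:', checksum)    # checksum: 37
```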
          

Converting Between bytes and Other Data Types

MicroPython is very flexible with the ability to readily convert between most data types where it makes sense to be able to do so.

One of the conversions in Example 5 is between bytes and array. Arrays are a very useful collection type and are defined in the uarray module. A tutorial covering MicroPython arrays can be found here.

Example 5

# Demonstrates how to convert between the
# bytes data type and other MicroPython
# data types.

import uarray

# Define a bytes object with
# a bytes string.
b = b'\t\x05\t\x02\x04\x08'
print('bytes:', b)

# Convert bytes to list
l = list(b)
print('list:', l)

# Convert bytes to tuple
t = tuple(b)
print('tuple:', t)

# Convert bytes to set
s = set(b)
print('set:', s)

# Convert bytes to array
a = uarray.array('B', b)
print('array:', a)

# Convert bytes to string
B = bytes('Hello World', 'ascii')
Str = B.decode()
print('\nbytes:', B)
print('string:', Str)
          
Output:

bytes: b'\t\x05\t\x02\x04\x08'
list: [9, 5, 9, 2, 4, 8]
tuple: (9, 5, 9, 2, 4, 8)
set: {8, 2, 9, 4, 5}
array: array('B', [9, 5, 9, 2, 4, 8])

bytes: b'Hello World'
string: Hello World
          

Note: The above example uses the decode() method to convert a byte string to a string.
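
The conversions also work in the opposite direction. The following sketch goes from bytes to a list, edits the list and converts it back, then does a string round trip using the same bytes() and decode() calls as Example 5.

```python
original = b'\t\x05\t\x02\x04\x08'

# bytes -> list, edit the list, then list -> bytes.
values = list(original)
values[0] = 7
rebuilt = bytes(values)
print(rebuilt)                     # b'\x07\x05\t\x02\x04\x08'

# tuple -> bytes
print(bytes((72, 105)))            # b'Hi'

# string -> bytes -> string
text = 'Hello World'
encoded = bytes(text, 'ascii')
print(encoded.decode() == text)    # True
```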

bytes Operators

Table 1: MicroPython bytes Operators
Operator Description
+

Concatenation; creates a new bytes object by appending the second bytes object to the end of the first.

b1 = bytes([1, 2])
b2 = bytes([3, 4])
b1 + b2
⇒ b'\x01\x02\x03\x04'

*

Repeats a bytes object a given number of times, creating a new bytes object.

b1 = bytes([1, 2])
b1 * 3
⇒ b'\x01\x02\x01\x02\x01\x02'

==

Comparison operator tests whether two bytes objects are equal

b1 = bytes([1, 2])
b2 = bytes([2, 1])
b3 = bytes([1, 2])

(b1 == b3) ⇒ True
(b1 == b2) ⇒ False

!=

Comparison operator tests whether two bytes objects are not equal

b1 = bytes([1, 2])
b2 = bytes([2, 1])
b3 = bytes([1, 2])

(b1 != b3) ⇒ False
(b1 != b2) ⇒ True

in

Membership operator; tests whether a bytes object occurs as a contiguous run of elements within another bytes object.

b1 = bytes([1, 2, 3, 4, 5])
b2 = bytes([2, 3, 4])
b3 = bytes([2, 3, 5])
b4 = bytes([4, 5, 6])

b2 in b1 ⇒ True
b3 in b1 ⇒ False
b4 in b1 ⇒ False

not in

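Membership operator; tests whether a bytes object does not occur as a contiguous run of elements within another bytes object.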

b1 = bytes([1, 2, 3, 4, 5])
b2 = bytes([2, 3, 4])
b3 = bytes([2, 3, 5])
b4 = bytes([4, 5, 6])

b2 not in b1 ⇒ False
b3 not in b1 ⇒ True
b4 not in b1 ⇒ True
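
The operators are often used together. As a sketch, the code below assembles a small frame from a header and some padding and then checks its contents; the header values are arbitrary and chosen only for illustration.

```python
header = bytes([0xAA, 0x55])
padding = bytes([0]) * 4          # repetition
frame = header + padding          # concatenation

print(frame)                      # b'\xaaU\x00\x00\x00\x00'
print(len(frame))                 # 6

# Membership looks for the whole sequence, in order.
print(header in frame)                    # True
print(bytes([0x55, 0xAA]) in frame)       # False

print(frame != header)            # True
```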

bytes Functions

Table 2: MicroPython bytes Functions
Function Description
len(bytes)

Returns the number of elements in a bytes object

b = bytes([2, 4, 1, 5, 3])
len(b) ⇒ 5

max(bytes)

Returns the maximum value in a bytes object.

b = bytes([2, 4, 1, 5, 3])
max(b) ⇒ 5

min(bytes)

Returns the minimum value in a bytes object.

b = bytes([2, 4, 1, 5, 3])
min(b) ⇒ 1

sorted(bytes
      [, reverse =
         True|False])

Returns a sorted list of the elements from the bytes object.

b = bytes([2, 4, 1, 5, 3])

sorted(b)
⇒ [1, 2, 3, 4, 5]

sorted(b, reverse = True)
⇒ [5, 4, 3, 2, 1]
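
These functions combine naturally when summarising a block of readings. The sketch below assumes a bytes object holding five made-up sensor values and uses only the functions from Table 2.

```python
# Made-up readings, one byte each.
readings = bytes([52, 48, 61, 55, 50])

count = len(readings)
lowest = min(readings)
highest = max(readings)
spread = highest - lowest

# sorted() returns a list, so the middle element can be indexed.
ordered = sorted(readings)
median = ordered[count // 2]

print(count, lowest, highest, spread, median)   # 5 48 61 13 52
```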

bytes Methods

“A method, like a function, is a set of instructions that perform a task. The difference is that a method is associated with an object, while a function is not.” [codecademy.com]

This series on MicroPython (for micro:bit) discusses classes and methods here.

Methods are invoked using dot notation i.e. bytes.method(), and the following tables provide simple examples. The tables provide an exhaustive list of the MicroPython for micro:bit bytes methods organised by classification.

Case Conversion

Table 3: MicroPython bytes Case Conversion Methods
Method Description
lower()

Converts to lower case

b'TrEe 78'.lower() ⇒ b'tree 78'

upper()

Converts to upper case

b'tReE 78'.upper() ⇒ b'TREE 78'
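
One use for case conversion is comparing received text without caring about capitalisation. In the sketch below, is_command is a hypothetical helper and the command text is invented for the example.

```python
def is_command(received, expected):
    """Hypothetical helper: compare two bytes objects, ignoring letter case."""
    return received.lower() == expected.lower()

print(is_command(b'LED On', b'led on'))    # True
print(is_command(b'LED Off', b'led on'))   # False

# Digits and spaces are left unchanged by lower() and upper().
print(b'TrEe 78'.lower())                  # b'tree 78'
```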

Find and Seek

Table 4: MicroPython Find and Seek bytes Methods
Method Description
count(sub
      [, start
      [, end]])

Counts instances of a substring in a byte string

b1 = bytes('spam ham am', 'ascii')
b1.count(b'am') ⇒ 3
b1.count(b'am', 1, 5)
⇒ 1

endswith(sub)

Returns True if byte string ends with the substring

b1 = bytes('spam ham am', 'ascii')
b1.endswith(b'am')
⇒ True

startswith(sub)

Returns True if byte string starts with the substring

b1 = bytes('spam ham', 'ascii')
b1.startswith(b'spam')
⇒ True

find(sub
      [, start
      [, end]])

Finds first occurrence of substring. Returns -1 if not found

b1 = bytes('spam ham am', 'ascii')
b1.find(b'am')
⇒ 2
b1.find(b'xx')
⇒ -1
b1.find(b'am',3,8)
⇒ 6

rfind(sub
      [, start
      [, end]])

Finds last occurrence of substring. Returns -1 if not found.

b1 = bytes('spam ham am', 'ascii')
b1.rfind(b'am')
⇒ 9
b1.rfind(b'am',3,8)
⇒ 6

index(sub
      [, start
      [, end]])

Finds first occurrence of substring. Raises an exception if the value is not found.

b1 = bytes('spam ham am', 'ascii')
b1.index(b'am') ⇒ 2
b1.index(b'xx')
⇒ ValueError: substring not found
b1.index(b'am',5,12) ⇒ 6

rindex(sub
      [, start
      [, end]])

Finds last occurrence of substring. Raises an exception if the value is not found.

b1 = bytes('spam ham am', 'ascii')
b1.rindex(b'am') ⇒ 9
b1.rindex(b'xx')
⇒ ValueError: substring not found
b1.rindex(b'am',5,8) ⇒ 6
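
The find() method pairs well with slicing when extracting a field from a longer byte string. The sketch below assumes a made-up 'key=value;' message format; get_value is a hypothetical helper, and it is deliberately simple (for example, it does not check that the key starts at a field boundary).

```python
def get_value(message, key):
    """Hypothetical helper: return the bytes that follow key= up to the next ';'."""
    marker = key + b'='
    start = message.find(marker)
    if start == -1:
        return None                  # key not present
    start += len(marker)
    end = message.find(b';', start)
    if end == -1:
        end = len(message)           # last field has no trailing ';'
    return message[start:end]

msg = b'temp=23;hum=61;led=on'
print(get_value(msg, b'temp'))   # b'23'
print(get_value(msg, b'hum'))    # b'61'
print(get_value(msg, b'led'))    # b'on'
print(get_value(msg, b'fan'))    # None
```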

Character Classification

Table 5: MicroPython bytes Character Classification Methods
Method Description
isalpha()

Checks if all characters in byte string are letters

b'spam ham clam'.isalpha()
⇒ False
b'spamhamclam'.isalpha()
⇒ True

isdigit()

Checks if all characters in byte string are digits

b'spam ham clam'.isdigit() ⇒ False
b'123'.isdigit() ⇒ True
b'123.5'.isdigit() ⇒ False
b''.isdigit() ⇒ False

isspace()

Checks if all characters in byte string are whitespace

b' '.isspace() ⇒ True
b'A1B2 C3'.isspace() ⇒ False

islower()

Checks if all letter characters in byte string are lowercase

b'spam ham clam 789'.islower() ⇒ True
b'Spam Ham clam 789'.islower() ⇒ False
b''.islower() ⇒ False

isupper()

Checks if all letter characters in byte string are uppercase

b1 = bytes('SPAM HAM CLAM JAM 789', 'ascii')
b1.isupper() ⇒ True

b2 = bytes('Spam Ham clam jam 789', 'ascii')
b2.isupper() ⇒ False

b'5'.isupper() ⇒ False
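
A classification method is a convenient guard before converting received text. As a sketch, the hypothetical helper parse_number below uses isdigit() to reject anything that is not purely ASCII digits and then builds the integer one byte at a time.

```python
def parse_number(field):
    """Hypothetical helper: convert ASCII digits in a bytes object to an int.

    Returns None if the field is empty or contains a non-digit.
    """
    if not field.isdigit():
        return None
    value = 0
    for byte in field:
        # Each element is the ASCII code of a digit; '0' is 48.
        value = value * 10 + (byte - ord('0'))
    return value

print(parse_number(b'123'))     # 123
print(parse_number(b'123.5'))   # None
print(parse_number(b''))        # None
```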

Formatting

In addition to the methods described in Table 6 below, formatted text can be stored in a bytes object by using the format() method. In Example 6 the format() method is called on an ordinary string and the result is then converted to a bytes object.

More details can be found here.

Example 6

# Demonstrate building formatted bytes objects: format() is applied
# to a string, which is then converted with bytes().

# Define some strings
pet = 'dog'
name = 'Sam'
age = 6
size = 'large'
activity = 'walking'
location = 'the park'

# Use these strings to format some byte strings.

# format() with empty placeholders.
b1 = bytes('My pet is a {} and his name is {}.'.format(pet, name), 'ascii')

# format() with numbered indexes.
b2 = bytes('{0} is a {1} {2} and is {3} years old.'.format(name, size, pet, age), 'ascii')

# format() with named indexes.
b3 = bytes('He enjoys {exercise} in {where}.'.format(exercise = activity, where = location), 'ascii')
b4 = bytes('We go to {where} in the {transport}.'.format(where = location, transport = 'car'), 'ascii')

print(b1)
print(b2)
print(b3)
print(b4)
            
Output:

b'My pet is a dog and his name is Sam.'
b'Sam is a large dog and is 6 years old.'
b'He enjoys walking in the park.'
b'We go to the park in the car.'
            
Table 6: MicroPython bytes Formatting Methods
Method Description
lstrip([chars])

Removes leading bytes that appear in the chars argument, which is treated as a set of values rather than a prefix. If no argument is given, whitespace is removed.

b' SPAM 789'.lstrip()
⇒ b'SPAM 789'
b'spam-ham-789'.lstrip(b'spam')
⇒ b'-ham-789'

rstrip([chars])

Removes trailing bytes that appear in the chars argument. If no argument is given, whitespace is removed.

b'SPAM 789 '.rstrip()
⇒ b'SPAM 789'
b'spam-ham-789'.rstrip(b'89')
⇒ b'spam-ham-7'

strip([chars])

Combined result of applying the lstrip and rstrip methods. If no argument is given, whitespace is removed.

b' SPAM - 789 '.strip()
⇒ b'SPAM - 789'
b'89spam789'.strip(b'89')
⇒ b'spam7'

replace(old,
     new[, count])

Replaces matching occurrences of old with new.

Optional count specifies number of times to do the replacement.

b'abcd'.replace(b'abc',b'ABC')
⇒ b'ABCd'

b1 = bytes('abcabdabe', 'ascii')
b1.replace(b'ab',b'AB', 2)
⇒ b'ABcABdabe'
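
As a closing sketch, the methods in Table 6 can be chained to tidy up a line of received text. The input line below is invented for the example; the last statement illustrates that the argument to lstrip() is a set of byte values, not a prefix.

```python
# A made-up line with surrounding whitespace and a line ending.
line = b'  temp=23 \r\n'

clean = line.strip()
print(clean)                          # b'temp=23'

label = clean.replace(b'=', b': ')
print(label)                          # b'temp: 23'

# Every leading 'a' or 'b' is removed, in any order, until another byte is met.
print(b'aabbcc-ab'.lstrip(b'ab'))     # b'cc-ab'
```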