C++: On Using int*_t as Overload and Template Parameters

 
 
 

In C++, there is a concept of “two types being the same”. I am a Big Fan(tm) of this concept and find it very useful, but sometimes it starts causing trouble. Here is one such example – and a way I am using to deal with it.

Preamble: int*_t Types vs Fundamental Types

In addition to being a Big Fan of the “two types being the same” concept, I am an even Bigger Fan(tm) of the int*_t family of types (including the uint*_t ones). Essentially, int*_t types are a workaround for a long-standing problem coming from the Dark Ages of K&R C: with fundamental C/C++ types (short, int, long, etc.) we cannot be sure what exactly we’re dealing with, so results are so platform-dependent that in many cases we cannot guarantee correctness of our program (what if we’re using int to iterate from 0 to 100000, and int happens to be 16-bit, which is IIRC technically allowed by the standard?). Sure, we can use static_assert(sizeof(int)>=whatever_we_need) to handle it, but IMNSHO int*_t types provide a much-better-usable and better-readable alternative.
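To make the comparison a bit more tangible, here is a minimal sketch of both options for the 0-to-100000 loop mentioned above. The function names are made up for illustration, and the 32-bit threshold in the static_assert is just one reasonable choice of "whatever we need".

```cpp
#include <climits>
#include <cstdint>

// Option 1: keep plain int, but refuse to compile where it is too narrow.
static_assert(sizeof(int) * CHAR_BIT >= 32, "int is too narrow to count to 100000");

long long sum_with_int() {
  long long sum = 0;
  for (int i = 0; i <= 100000; ++i) sum += i;
  return sum;
}

// Option 2: state the required width in the type itself.
long long sum_with_int32() {
  long long sum = 0;
  for (std::int32_t i = 0; i <= 100000; ++i) sum += i;
  return sum;
}
```

Both versions compute the same thing wherever they compile; the second one simply doesn't need a separate assertion to be safe.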

One thing to remember in this regard is that

int*_t types are NOT ‘real’ types, they’re merely typedefs to appropriate fundamental types

Most of the time, we don’t really care, but unfortunately, sometimes we do.

Overloads/Template-Specialization Parameters: Beware Mixing Fundamental and int*_t Types

One quite obvious issue with the int*_t types arises when we’re trying to write an overloaded function which has both a fundamental-type version and an int*_t version: while

#include <stdint.h>
int f(long x) {
  return 0;
}

int f(int64_t x){
  return 0;
}

will compile on some platforms, it won’t compile on some others: in particular, if int64_t happens to be a typedef of long – then we have two definitions of the same function.
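Whether the two overloads above collide on a given platform can be checked at compile time with std::is_same. The following is only a sketch (the names long_is_int64, long_long_is_int64 and int64_alias_count are hypothetical); it relies on the fact that the standard signed integer types are all distinct from each other, so int64_t can be an alias of at most one of them.

```cpp
#include <cstdint>
#include <type_traits>

// True on platforms where f(long) and f(int64_t) would be one and the same function.
constexpr bool long_is_int64 = std::is_same<long, std::int64_t>::value;
// True on platforms where f(long long) and f(int64_t) would clash instead.
constexpr bool long_long_is_int64 = std::is_same<long long, std::int64_t>::value;

// How many standard signed integer types int64_t is an alias of.
constexpr int int64_alias_count() {
  return int(std::is_same<signed char, std::int64_t>::value)
       + int(std::is_same<short, std::int64_t>::value)
       + int(std::is_same<int, std::int64_t>::value)
       + int(std::is_same<long, std::int64_t>::value)
       + int(std::is_same<long long, std::int64_t>::value);
}
```

Which of the two flags is true differs between platforms, which is exactly why the overload pair compiles in one place and fails in another.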

The best approach I know to deal with this issue – is to say that

We decide which set of types we’re using (fundamental or int*_t), and use it consistently

(this consistency MUST be maintained at least for one single overload, but I argue we SHOULD keep it at least at sub-project level). In other words – if we’re using any of fundamental types in overloads – we SHOULDN’T use any int*_t, and vice versa.

Another closely related issue is that

As soon as we have decided on a set of types, we’d better define overloads for ALL the types in the set

In particular, if we’re dealing with int*_t types, we should provide overloads for ALL of {int8_t, int16_t, int32_t, int64_t, uint8_t, uint16_t, uint32_t, uint64_t}. And for fundamental types, strictly speaking, we should provide overloads for ALL of {signed char, unsigned char, char /*sic! – char is a distinct type*/, wchar_t, char16_t, char32_t, short, unsigned short, int, unsigned int, long, unsigned long, long long, unsigned long long} <phew /> [1].

If we’re not specifying ALL the overloads in a set, it MAY result in the compiler choosing a counter-intuitive overload for your call. The most egregious example I have heard of is an f() with overloads f(char) and f(int): a call f((signed char)(0)), even on a platform where char is signed, counter-intuitively goes to f(int). The reason is that char is a distinct type from signed char, so the compiler picks the int overload as the one which requires only a promotion rather than a conversion. Pretty ugly if you ask me 🙁 .
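The char-vs-int surprise is easy to reproduce. In the sketch below, g() and call_with_signed_char() are hypothetical names used only for this illustration; the behavior itself follows from the overload resolution rules (signed char to int is an integral promotion, signed char to char is an integral conversion, and promotions rank better).

```cpp
#include <string>

// Hypothetical overload set with only two of the character/integer types covered.
std::string g(char) { return "char"; }
std::string g(int)  { return "int"; }

std::string call_with_signed_char() {
  // signed char -> int is a promotion; signed char -> char is a conversion,
  // so the int overload wins even where plain char is signed.
  return g(static_cast<signed char>(0));
}
```

Adding a g(signed char) overload (and, for symmetry, g(unsigned char)) makes the call land where a reader would expect.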

Also, let’s note that dealing-with-overloads is NOT the only case where we need to follow this approach – at the very least the very same logic applies to the template specializations.
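With specializations, the failure can be quieter than with overloads: instead of an ambiguity error, a type without its int*_t counterpart may silently fall back to the primary template. A small sketch, assuming a made-up trait int_name that is specialized only for int32_t and int64_t:

```cpp
#include <cstdint>
#include <string>

// Hypothetical trait, for illustration only.
template<class T> struct int_name { static std::string get() { return "unknown"; } };
template<> struct int_name<std::int32_t> { static std::string get() { return "int32_t"; } };
template<> struct int_name<std::int64_t> { static std::string get() { return "int64_t"; } };

// Counts how many of int, long, long long fall back to the primary template.
int count_unmatched() {
  int n = 0;
  if (int_name<int>::get() == "unknown") ++n;
  if (int_name<long>::get() == "unknown") ++n;
  if (int_name<long long>::get() == "unknown") ++n;
  return n;
}
```

There are three distinct fundamental types here and only two specializations, so at least one of them is always left out; which one it is depends on the platform.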


[1] Of course, if we’re 100% sure that we’re not using some of these types, we can safely ignore them.

 

Transition Between int*_t and Fundamental Types

NB: up to this point, things were fairly obvious, now we’re getting to the interesting part 🙂 

It is all grand and dandy, and keeping things consistent will work – until we have to convert from one set of types to another one. Let’s consider the following scenario:

  • we have function f(T) and faithfully wrote a full set of int*_t overloads for it (int8_t, int16_t, int32_t, and int64_t), which works fine
  • however, the subproject which uses our function f() decided on using fundamental types rather than int*_t types. This kind of thing happens all the time, especially if we’re using a 3rd-party library (or writing a library ourselves).

At this point, we have a problem: while we did implement all the int*_t overloads (and DID provide all the logic which might be necessary), from a formal point of view it doesn’t cover all the necessary cases. The thing is that while we have four int*_t overloads (int8_t, int16_t, int32_t, and int64_t), in “fundamental” type space there are at least five distinct types (char, short, int, long, long long) [2], so at least some of the fundamental types are bound to be left without their int*_t counterpart. This leads to unpleasant problems such as:

#include <stdint.h>
int f(int8_t x) {
  return 0;
}
int f(int16_t x) {
  return 0;
}
int f(int32_t x) {
  return 0;
}
int f(int64_t x) {
  return 0;
}

int main() {
  int x = 0;
  return f(x);//ERROR (ambiguous call) under 32-bit ARM GCC, if int32_t is a typedef of long there;
              //  under x86-64 Linux it is 'long long x' which leads to the same ERROR
}

The problem here, of course, is that we don’t know which of fundamental types is left without its int*_t counterpart, so we cannot simply add it to the set of overloads.

The best way I found to deal with this issue, is rather bulky – but it does work pretty well. The main idea is that

while the ‘missing’ type is formally different – on the vast majority of platforms it differs from one of the int*_t types only by name

In other words – on most existing platforms, while each fundamental type behaves exactly as one of int*_t types (i.e. it quacks like a duck, swims like a duck, and looks like a duck), it is NOT recognized as such. However, apparently we can fix it ourselves:

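#include <limits.h>    // CHAR_BIT
#include <stddef.h>    // size_t
#include <stdint.h>    // int8_t ... uint64_t
#include <type_traits> // std::is_integral, std::is_signed, std::conditional

//{ this is done ONCE for ALL the overloads; taken from github kscope project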
template<size_t N>
struct kscope_sint_by_size;
template<>
struct kscope_sint_by_size<8> { using type = int8_t; };
template<>
struct kscope_sint_by_size<16> { using type = int16_t; };
template<>
struct kscope_sint_by_size<32> { using type = int32_t; };
template<>
struct kscope_sint_by_size<64> { using type = int64_t; };

template<class T>
struct kscope_normalized_signed_integral_type {
  static_assert(std::is_integral<T>::value);
  using type = typename kscope_sint_by_size<CHAR_BIT*sizeof(T)>::type;
  static_assert(sizeof(type)==sizeof(T));
  static_assert(std::is_signed<type>::value);
};
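//pretty much the same thing should be done for unsigned;
//  the names below simply mirror the signed version (an assumption, NOT copied from kscope)
template<size_t N>
struct kscope_uint_by_size;
template<>
struct kscope_uint_by_size<8> { using type = uint8_t; };
template<>
struct kscope_uint_by_size<16> { using type = uint16_t; };
template<>
struct kscope_uint_by_size<32> { using type = uint32_t; };
template<>
struct kscope_uint_by_size<64> { using type = uint64_t; };

template<class T>
struct kscope_normalized_unsigned_integral_type {
  static_assert(std::is_integral<T>::value);
  using type = typename kscope_uint_by_size<CHAR_BIT*sizeof(T)>::type;
  static_assert(sizeof(type)==sizeof(T));
  static_assert(std::is_unsigned<type>::value);
};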

template<class T>
 struct kscope_normalized_integral_type {
   static_assert(std::is_integral<T>::value);
   using type = typename std::conditional< std::is_signed<T>::value,
     typename kscope_normalized_signed_integral_type<T>::type,
     typename kscope_normalized_unsigned_integral_type<T>::type
   >::type;
 
   static_assert(std::is_integral<type>::value);
   static_assert(sizeof(T)==sizeof(type));
   static_assert(std::is_signed<type>::value == std::is_signed<T>::value);
 };
//} this is done ONCE for ALL the overloads 

//{ this should be done per overload
int f_(int8_t x) {
  return 0;
}
int f_(int16_t x) {
  return 0;
}
int f_(int32_t x) {
  return 0;
}
int f_(int64_t x) {
  return 0;
}
int f_(uint8_t x) {
  return 0;
}
int f_(uint16_t x) {
  return 0;
}
int f_(uint32_t x) {
  return 0;
}
int f_(uint64_t x) {
  return 0;
} 
template<class T>
int f(T x) {
  using TT = typename kscope_normalized_integral_type<T>::type;
  return f_(TT(x));//this cast is a no-op(!)
}
//} this should be done per overload 

//using our overload
int main() {
  //return f_(char(0))+f_(short(0))+f_(int(0))+f_(long(0))+f_((long long)(0)); //ERROR on ANY platform
  return f(char(0))+f(short(0))+f(int(0))+f(long(0))+f((long long)(0)); //WORKS under MOST platforms
} 

Godbolt link.

Notes:

  • the point of the code above is that with the precautions made, the cast to the TT type within function f() is a no-op.
    • Indeed, TT has the same signed-ness, and exactly the same size as original type T (=”it quacks like a duck and swims like a duck”). The only thing we’re doing here, is changing the name of the type, NOT its behavior.
    • NB: while this does work on all the major platforms/compilers I know about, if somebody knows of a platform where it doesn’t [3] – please LMK.
  • if desired, template wrapper (f(T x) in the example above), can be implemented as a macro with the only parameter of the macro being function name. In other words – there is absolutely no logic there, merely boilerplate (which is bad because it is pointless code, and which is good because it is quite difficult to make a mistake there).
  • exactly the same approach works for using int*_t in template specializations (actually, it was the reason why I had to use it in kscope in the first place).
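As for the macro version of the wrapper mentioned in the notes, it might look roughly as follows. This is a sketch and not the kscope code: the normalizer is a compact variant keyed on bit width and signedness, and all the names in it (int_by_bits, normalized_integral, DEFINE_NORMALIZING_WRAPPER, bits_) are hypothetical. It also assumes C++14 or later for the deduced return type.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical compact normalizer: (bit width, signedness) -> intN_t / uintN_t.
template<std::size_t Bits, bool Signed> struct int_by_bits;
template<> struct int_by_bits<8,  true>  { using type = std::int8_t;   };
template<> struct int_by_bits<16, true>  { using type = std::int16_t;  };
template<> struct int_by_bits<32, true>  { using type = std::int32_t;  };
template<> struct int_by_bits<64, true>  { using type = std::int64_t;  };
template<> struct int_by_bits<8,  false> { using type = std::uint8_t;  };
template<> struct int_by_bits<16, false> { using type = std::uint16_t; };
template<> struct int_by_bits<32, false> { using type = std::uint32_t; };
template<> struct int_by_bits<64, false> { using type = std::uint64_t; };

template<class T>
struct normalized_integral {
  static_assert(std::is_integral<T>::value, "integral types only");
  using type = typename int_by_bits<CHAR_BIT * sizeof(T), std::is_signed<T>::value>::type;
};

// The wrapper as a macro; its only parameter is the function name.
// It forwards fname(x) to the existing overload set fname_(...).
#define DEFINE_NORMALIZING_WRAPPER(fname)                 \
  template<class T>                                       \
  auto fname(T x) {                                       \
    using TT = typename normalized_integral<T>::type;     \
    return fname##_(TT(x));                               \
  }

// Hypothetical overload set: returns the bit width, plus 1000 for unsigned types.
int bits_(std::int8_t)   { return 8; }
int bits_(std::int16_t)  { return 16; }
int bits_(std::int32_t)  { return 32; }
int bits_(std::int64_t)  { return 64; }
int bits_(std::uint8_t)  { return 1008; }
int bits_(std::uint16_t) { return 1016; }
int bits_(std::uint32_t) { return 1032; }
int bits_(std::uint64_t) { return 1064; }

DEFINE_NORMALIZING_WRAPPER(bits)
```

With this in place, bits(0L) and bits(0LL) both compile and pick the overload matching the actual width of long and long long on the platform at hand. Note that the bits_ overloads have to be declared before the macro is used, as fundamental types get no help from argument-dependent lookup.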

[2] strictly speaking, there are at least 14 listed above: signed char, unsigned char, char, wchar_t, char16_t, char32_t, short, unsigned short, int, unsigned int, long, unsigned long, long long, unsigned long long
[3] beyond obvious platforms-which-don’t-support-int*_t

 

Summary

We took a look at using int*_t types as overload (and template) parameters. While problems along this way are admittedly rare, IF you happen to run into them – the approach outlined above should help.

 


Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.
