Abstract: Although Solaris provides a 32-bit address space, programs encounter minor problems and restrictions beyond 2GB. This application note demonstrates the techniques needed to exploit the 4GB address space.
Until the release of a 64-bit version of Solaris, applications running on Sun machines will have to work within the 32-bit address space. While this is large enough for the vast majority of programs, a few have already reached this boundary.
Unfortunately, reaching the nearly 4GB limit that the system provides is not currently straightforward. Instead, many programs run into a barrier at 2GB when they attempt to grow by using malloc() (and, indirectly, sbrk()).
The purpose of this application note is to demonstrate how to exploit the entire virtual address space.
Although there are undoubtedly more elegant ways of checking your available VM, I abuse the knowledge that "/tmp" is a memory-based filesystem to measure VM indirectly. Specifically, the amount of available VM is roughly equal to the available /tmp space, thus:
% df -k /tmp
Filesystem kbytes used avail capacity Mounted on
swap 2407144 3536 2403608 1% /tmp
shows that this system has 2.4GB of available VM.
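The same measurement can be made from inside a program. The following is a minimal sketch, assuming (as above) that /tmp is a swap-backed filesystem and that the compiler supports long long; the helper name avail_kbytes is hypothetical and not part of this note's appendices. It relies only on the standard statvfs() call.

```c
#include <sys/types.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper: return the space available to unprivileged users
 * on the filesystem holding "path", in kilobytes, or -1 on failure.
 * Where /tmp is swap-backed, the result for "/tmp" approximates the
 * available VM.
 */
long long
avail_kbytes(const char *path)
{
	struct statvfs sv;

	if (statvfs(path, &sv) != 0)
		return -1;
	/* f_bavail counts free blocks in units of f_frsize bytes */
	return (long long)(((unsigned long long)sv.f_bavail * sv.f_frsize) / 1024);
}
```

Printing avail_kbytes("/tmp") with a %lld format should give a figure close to the avail column of df -k /tmp.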
Access to the upper 2GB of the address space has improved with each successive release of Solaris. As a result, there are four distinct sets of constraints placed on an application, with each newer release being easier to work with.
As a software vendor, you can trade off your development costs against the age of the OS you want to support and the requirements you want to place on your user.
The following table summarizes the changes and points to the specific actions needed to use VM beyond 2GB.
| OS Release Description | Requirements on application | Requirements on end user |
| 2.6 | No changes necessary | user permissions sufficient to set data limit |
| 2.5.1 + patches | No changes necessary | root permissions necessary to set data limit |
| 2.5.1 | Modify to use mmap() | user permissions sufficient to set VM limit |
| 2.5 | Modify to use mmap() | root permissions required to set VM limit |
% limit datasize 4194303
% limit datasize
datasize 4194303 kbytes
[MPI: We tried to use this magic number of 4194303 and always got error messages
like the following:]
# limit datasize 4194303
limit: datasize: Can't set limit
Trying other magic numbers showed us a different limit:
# limit datasize 3932152
limit: datasize: Can't set limit
# limit datasize 3932151
#
[Note that although the issue is the virtual memory limit, malloc
grows a process via the data segment and the datasize
limit is actually the gating factor. For logical consistency you might
also increase your memorysize limit, but Solaris doesn't appear
to consult this limit when growing a process via malloc().]
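The arithmetic behind these values is easy to check. The limit command takes its argument in kilobytes, so 4194303 is the largest value whose byte count still fits in 32 bits. The sketch below uses a hypothetical helper, kb_to_bytes, purely to make the conversion explicit; why the system in the bracketed MPI note stopped at 3932151 is not established here.

```c
/*
 * Hypothetical helper: convert a limit given in kilobytes (the unit the
 * shell's "limit" command uses) to bytes. The result is a 64-bit value
 * so the multiplication cannot overflow.
 */
unsigned long long
kb_to_bytes(unsigned long kbytes)
{
	return (unsigned long long)kbytes * 1024ULL;
}

/*
 * kb_to_bytes(4194303) is 4294966272, i.e. 0xFFFFFC00 (4GB minus 1KB).
 * kb_to_bytes(3932151) is 4026522624, i.e. 0xEFFFDC00 (9KB below
 * 0xF0000000).
 */
```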
Beware that the standard shells have a peculiar definition of the term unlimited. In particular, if you check your resource limits, for example:
% limit datasize
datasize unlimited
you might not realize that, for example, datasize is
actually limited to 2GB (by default). Fortunately, these commands only
lie if the limit is exactly 2GB and will report accurately
if you set your limit to something higher.
If you run the showlimits program from Appendix A,
you can see the actual limits:
% ./showlimits
Current/maximum data limit is 2147479552 / 2147479552
Current/maximum stack limit is 8388608 / 2147479552
Current/maximum vmem limit is 2147483647 / 2147483647
If you're currently developing code which will eventually run on Solaris 2.6, it would be a convenience to your users if you include code which attempts to automatically increase the data limit. In order for it to be compatible with previous releases of the OS (which require root permissions to increase these limits), you might consider letting it silently ignore a failure. Example code might be:
struct rlimit datalim;
/* Select some large value as the upper data limit */
datalim.rlim_cur = datalim.rlim_max = 3500000000UL;
/* Ignore failure in order to be compatible with 2.5.1 */
(void)setrlimit(RLIMIT_DATA, &datalim);
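The fragment above can be wrapped in a small function. What follows is a sketch under stated assumptions, not a definitive implementation: the name try_raise_data_limit is hypothetical, and unlike the fragment it reads the current limit first so that it never lowers a limit that is already high enough. It compares the limits numerically rather than testing for RLIM_INFINITY, because (as noted above) a limit reported as unlimited may in fact be 2GB.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: try to make the data limit at least "target"
 * bytes. Returns 0 if the limit was already large enough or was raised,
 * -1 otherwise. Callers that must run on older releases can simply
 * ignore the return value.
 */
int
try_raise_data_limit(rlim_t target)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_DATA, &lim) != 0)
		return -1;
	/* Never lower an existing limit */
	if (lim.rlim_cur >= target)
		return 0;
	lim.rlim_cur = target;
	/* Raising the hard limit may require privilege on older releases */
	if (lim.rlim_max < target)
		lim.rlim_max = target;
	return setrlimit(RLIMIT_DATA, &lim);
}
```

A program would call (void)try_raise_data_limit(3500000000UL) once, early in main(), before its first large allocation.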
Increasing this limit requires either root intervention, with a command sequence like:
% su root
passwd:
# limit datasize 3000000
# su - username
%
or more likely, the use of a setuid program which accomplishes
the same thing (see Appendix B).
Unlike the previous examples, since this technique does NOT grow the process via the data segment, you will be required to increase the memorysize resource limit (which IS checked by the OS when you attempt to perform the mmap()).
Note that Solaris does supply an alternative version of malloc() which can take advantage of multiple memory mappings. See the mapmalloc(3x) man page for details, but note the caveat which warns:
Patch 103640 (revision -08 or newer) is externally available at http://sunsolve1.sun.com/sunsolve and internally at Sun at: http://sunsolve.corp/sunsolvei/patchpages/os-5.5.1.html
Also, this patch requires the installation of at least revision -03 of patch 103600.
To determine the current resource limits, compile and execute the following program, showlimits.c:
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>
static void
showlimit(int resource, char* str)
{
struct rlimit lim;
if (getrlimit(resource, &lim) != 0) {
(void)printf("Couldn't retrieve %s limit\n", str);
return;
}
(void)printf("Current/maximum %s limit is \t%lu / %lu\n",
str, lim.rlim_cur, lim.rlim_max);
}
/*ARGSUSED*/
int
main(int argc, char*argv[])
{
showlimit(RLIMIT_DATA, "data");
showlimit(RLIMIT_STACK, "stack");
showlimit(RLIMIT_VMEM, "vmem");
return 0;
}
The following program is an example of a wrapper program which could be used on Solaris 2.5.1 to increase the data segment resource limit.
It can be compiled and used as follows:
$ cc -o wrapper wrapper.c
$ su root
# chmod 4755 wrapper
# chown root wrapper
# ^D
$ wrapper
Initial limits are:
rlim_cur = 2147483647
rlim_max = 2147483647
Updated limits are:
rlim_cur = 3221225472
rlim_max = 3221225472
Note that once the data limit is increased beyond 2GB, the shell built-in commands will report the correct size:
% limit datasize
datasize 3145728 kbytes
Note that, for simplicity, the wrapper sets the data limit to an arbitrary value of 3GB. Presumably something more elegant could be arranged for production code. The source for wrapper.c follows.
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
/*
* Just an arbitrary large number for example purposes.
*/
#define THREE_GB (3221225472UL)
/*ARGSUSED*/
int
main(int argc, char *argv[], char*env[])
{
struct rlimit lim;
unsigned long target_vm_limit = THREE_GB;
/*
* Check and print the current data segment limit
*/
if (getrlimit(RLIMIT_DATA, &lim) == -1)
perror("getrlimit"), exit(-1);
(void)printf("Initial limits are:\n rlim_cur = %lu\n rlim_max = %lu\n",
lim.rlim_cur, lim.rlim_max);
/*
* Set the hard and soft limits to the target value
*/
lim.rlim_cur = lim.rlim_max = target_vm_limit;
if (setrlimit(RLIMIT_DATA, &lim) == -1)
perror("setrlimit"), exit(-1);
/*
* Check that it all worked.
*/
if (getrlimit(RLIMIT_DATA, &lim) == -1)
perror("getrlimit"), exit(-1);
(void)printf("Updated limits are:\n rlim_cur = %lu\n rlim_max = %lu\n",
lim.rlim_cur, lim.rlim_max);
/*
* Do NOT run the application with "root" permissions.
* Revert to the normal user id.
*/
if (setuid(getuid()) == -1)
perror("setuid"), exit(-1);
/*
* At this point, a "wrapper" program should exec() the appropriate
* application with normal user permissions. For demonstration
* purposes, however, this example will instead invoke a shell with
* the new resource limits.
*
* Note that if it simply called "return", the invoking
* shell would still have its original limits.
*/
(void)execve("/bin/ksh", argv, env);
perror("execve");
return -1;
}
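As the comment in wrapper.c says, a production wrapper would exec() the real application rather than a shell. One way to do that is sketched below, assuming the application's path is passed as the wrapper's first argument; the function name exec_target and that calling convention are hypothetical. It is meant to replace the final execve() call and must only be reached after the setuid(getuid()) step has succeeded.

```c
#include <unistd.h>

/*
 * Hypothetical replacement for the final execve() in wrapper.c: run the
 * program named by the wrapper's first argument, passing along the
 * remaining arguments and the environment. Returns -1 if no program was
 * named or if execve() failed; on success it does not return.
 */
int
exec_target(int argc, char *argv[], char *env[])
{
	if (argc < 2)
		return -1;
	/*
	 * argv[1] becomes the target's argv[0]. It must be a path name,
	 * because execve() does not search PATH.
	 */
	(void)execve(argv[1], &argv[1], env);
	return -1;
}
```

With this in place the wrapper would be invoked as, for example, wrapper /path/to/application args, and the application would inherit the raised limit.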
To demonstrate this workaround, the following program will allocate as much memory as it can via malloc() and then allocate more in another memory segment via mmap(). To absolutely prove that the memory is really available, the demo program will then write into every allocated page.
While the program is running, you can inspect the layout of the memory segments as follows:
$ cc -o demo demo.c
$ df -k /tmp # Verify that there is sufficient swap space
Filesystem kbytes used avail capacity Mounted on
swap 3429560 80 3429480 1% /tmp
$ demo > demo.out 2>&1 &
[2] 1268
$ /usr/proc/bin/pmap 1268
1268: demo
00010000 8K read/exec demo2
00020000 8K read/write/exec demo2
000220002050784K [ heap ]
000220002050784K read/write/exec
ADC000001074224K read/write
EF6F0000 16K read/exec /usr/platform/SUNW,Ultra-1/lib/libc_psr.so.1
EF700000 544K read/exec /usr/lib/libc.so.1
EF796000 40K read/write/exec /usr/lib/libc.so.1
EF7A0000 8K read/write/exec
EF7C0000 8K read/exec/shared /usr/lib/libdl.so.1
EF7D0000 88K read/exec /usr/lib/ld.so.1
EF7F0000 8K read/write/exec /usr/lib/ld.so.1
EFFFC000 16K read/write/exec
EFFFC000 16K [ stack ]
If you can ignore the fact that the address and size columns ran together, you'll see that the 4th entry (labelled "[heap]") is 2050784KB and the 6th entry (which is unlabelled because it is anonymous memory) is 1074224KB for a total allocation of slightly more than 3GB.
The source code for demo.c is:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#define ALLOCSIZE 3200000000UL
#define DELTASIZE 50000000
int
main(int argc, char *argv[])
{
int fd;
unsigned long i; /* byte offsets exceed INT_MAX, so int would overflow */
char *malloc_space, *mmap_space;
long pagesize = sysconf(_SC_PAGESIZE);
unsigned long mmap_size, malloc_size = ALLOCSIZE;
/*
* Allocate as much memory as possible via malloc
*/
while ((malloc_space = (char*)malloc(malloc_size)) == NULL) {
if (malloc_size < DELTASIZE)
(void)fprintf(stderr, "malloc failed\n"), exit(-1);
malloc_size -= DELTASIZE;
}
(void)fprintf(stderr, "malloc'd %lu bytes\n", malloc_size);
/*
* Allocate enough extra via mmap to reach ALLOCSIZE bytes
*/
mmap_size = ALLOCSIZE - malloc_size;
if ((fd = open("/dev/zero", O_RDWR)) == -1)
perror("open"), exit(-1);
mmap_space = (void*)mmap((caddr_t) 0,
mmap_size,
(PROT_READ | PROT_WRITE),
MAP_PRIVATE,
fd,
(off_t)0);
if (mmap_space == MAP_FAILED)
perror("mmap"), exit(-1);
(void)close(fd);
(void)fprintf(stderr, "mmap'd %lu bytes\n", mmap_size);
/*
* Just to be thorough, test every page of both allocations to make
* absolutely sure that the memory was really allocated. This will
* take a while.
*/
(void)fprintf(stderr, "Testing the %lu malloc'd bytes ...\n", malloc_size);
for (i=0; i<malloc_size; i+=pagesize)
malloc_space[i] = i;
(void)fprintf(stderr, "Testing the %lu mmap'd bytes ...\n", mmap_size);
for (i=0; i<mmap_size; i+=pagesize)
mmap_space[i] = i;
(void)fprintf(stderr, "done\n");
return 0;
}
One issue to look out for is that a large stack limit can inhibit growing your data segment. Note that even if your stack hasn't grown to be large, the virtual memory space for it is reserved according to the limit value. If you have difficulty growing the heap past 2GB, check to make sure your VM isn't consumed by pre-allocated stack:
% limit stack
stacksize 2097152 kbytes
... no chance of acquiring more than 2GB of data space ...
% limit stack 16000
% limit stack
stacksize 16000 kbytes
... 3+ GB should be available for data space ...
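The same adjustment can be made programmatically, for instance in a wrapper before it exec()s the application. The sketch below assumes that lowering the soft stack limit before exec() is enough to shrink the reserved range; this note does not establish whether the change helps a process that is already running. The name cap_stack_soft_limit is hypothetical. Lowering a soft limit requires no special privilege.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: ensure the soft stack limit is no larger than
 * "max_bytes". The hard limit is left untouched. Returns 0 on success
 * (including when the limit was already small enough), -1 on failure.
 */
int
cap_stack_soft_limit(rlim_t max_bytes)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_STACK, &lim) != 0)
		return -1;
	/* Already at or below the requested cap: nothing to do */
	if (lim.rlim_cur <= max_bytes)
		return 0;
	lim.rlim_cur = max_bytes;
	return setrlimit(RLIMIT_STACK, &lim);
}
```

Calling cap_stack_soft_limit((rlim_t)16000 * 1024) corresponds to the limit stack 16000 command shown above.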